SEED TRAIT GENES



RELATED APPLICATION INFORMATION 

The present invention claims the benefit of US Provisional Patent Application Serial
Nos. 60/166,228 filed November 17, 1999 and 60/197,899 filed April 17, 2000, and "Plant Trait
Modification III" filed August 22, 2000.

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the present
invention pertains to compositions and methods for phenotypically modifying a plant.

BACKGROUND OF THE INVENTION 

Transcription factors can modulate gene expression, either increasing or
decreasing (inducing or repressing) the rate of transcription. This modulation results in
differential levels of gene expression at various developmental stages, in different tissues and cell
types, and in response to different exogenous (e.g., environmental) and endogenous stimuli
throughout the life cycle of the organism.

Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can change entire 
biological pathways in an organism. For example, manipulation of the levels of selected
transcription factors may increase the expression of economically useful proteins or
metabolic chemicals in plants or improve other agriculturally relevant characteristics.
Conversely, blocked or reduced expression of a transcription factor may reduce biosynthesis of 
unwanted compounds or remove an undesirable trait. Therefore, manipulating transcription 
factor levels in a plant offers tremendous potential in agricultural biotechnology for modifying a 
plant's traits. 

The present invention provides novel transcription factors useful for modifying a 
plant's phenotype in desirable ways, such as modifying the characteristics of a plant's seed. 

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide comprising
a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a
polypeptide comprising a sequence selected from SEQ ID Nos. 2N, where N=1-27, or a
complementary nucleotide sequence thereof; (b) a nucleotide sequence encoding a polypeptide
comprising a conservatively substituted variant of a polypeptide of (a); (c) a nucleotide sequence
comprising a sequence selected from those of SEQ ID Nos. 2N-1, where N=1-27, or a
complementary nucleotide sequence thereof; (d) a nucleotide sequence comprising silent
substitutions in a nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under
stringent conditions over substantially the entire length of a nucleotide sequence of one or more
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 15 consecutive nucleotides of
a sequence of any of (a)-(e); (g) a nucleotide sequence comprising a subsequence or fragment of
any of (a)-(f), which subsequence or fragment encodes a polypeptide having a biological activity
that modifies a plant's seed characteristics; (h) a nucleotide sequence having at least 31%
sequence identity to a nucleotide sequence of any of (a)-(g); (i) a nucleotide sequence having at
least 60% sequence identity to a nucleotide sequence of any of (a)-(g); (j) a nucleotide
sequence which encodes a polypeptide having at least 31% sequence identity to a
polypeptide of SEQ ID Nos. 2N, where N=1-27; (k) a nucleotide sequence which encodes a
polypeptide having at least 60% sequence identity to a polypeptide of SEQ ID Nos. 2N,
where N=1-27; and (l) a nucleotide sequence which encodes a conserved domain of a polypeptide
having at least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos.
2N, where N=1-27. The recombinant polynucleotide may further comprise a constitutive,
inducible, or tissue-active promoter operably linked to the nucleotide sequence. The invention
also relates to compositions comprising at least two of the above-described polynucleotides.
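
The SEQ ID numbering used in this aspect (nucleotide sequences at SEQ ID Nos. 2N-1 and the encoded polypeptides at SEQ ID Nos. 2N, for N=1-27) can be illustrated with a minimal sketch. The function name `seq_id_pairs` is hypothetical and is introduced here only for illustration; it is not part of the Sequence Listing.

```python
def seq_id_pairs(n_max=27):
    """Return (N, nucleotide SEQ ID No., polypeptide SEQ ID No.) for N = 1..n_max.

    Assumes the convention stated in the text: nucleotide sequences are
    numbered 2N-1 and the corresponding polypeptides are numbered 2N.
    """
    return [(n, 2 * n - 1, 2 * n) for n in range(1, n_max + 1)]


pairs = seq_id_pairs()
print(pairs[0])    # (1, 1, 2)
print(pairs[-1])   # (27, 53, 54)
```

Under this convention, odd SEQ ID Nos. 1 to 53 would denote polynucleotides and even SEQ ID Nos. 2 to 54 would denote polypeptides.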

In a second aspect, the invention is an isolated or recombinant polypeptide 
comprising a subsequence of at least about 15 contiguous amino acids encoded by the 
recombinant or isolated polynucleotide described above.

In another aspect, the invention is a transgenic plant comprising one or more of 
the above described recombinant polynucleotides. In yet another aspect, the invention is a plant 
with altered expression levels of a polynucleotide described above or a plant with altered 
expression or activity levels of an above described polypeptide. Further, the invention may be a 
plant lacking a nucleotide sequence encoding a polypeptide described above. The plant may be a
soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, or 
vegetable brassicas plant.

In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells comprising the 
cloning or expression vector. 

In yet a further aspect, the invention relates to a composition produced by
incubating a polynucleotide of the invention with a nuclease, a restriction enzyme, a polymerase,
a polymerase and a primer, a cloning vector, or a cell.

Furthermore, the invention relates to a method for producing a plant having 
improved seed traits. The method comprises altering the expression of an isolated or recombinant 
polynucleotide of the invention or altering the expression or activity of a polypeptide of the 
invention in a plant to produce a modified plant, and selecting the modified plant for modified 
seed traits. 

In another aspect, the invention relates to a method of identifying a factor that is 
modulated by or interacts with a polypeptide encoded by a polynucleotide of the invention. The 
method comprises expressing a polypeptide encoded by the polynucleotide in a plant; and 
identifying at least one factor that is modulated by or interacts with the polypeptide. In one
embodiment, the modulating or interacting factors are identified by detecting binding
by the polypeptide to a promoter sequence, by detecting interactions between an additional
protein and the polypeptide in a yeast two-hybrid system, or by detecting expression of a factor by
hybridization to a microarray, subtractive hybridization or differential display.

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The method 
comprises placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of the invention and monitoring one or more of the 
expression level of the polynucleotide in the plant, the expression level of the polypeptide in the 
plant, and modulation of an activity of the polypeptide in the plant. 

In yet another aspect, the invention relates to an integrated system, computer or 
computer readable medium comprising one or more character strings corresponding to a 
polynucleotide of the invention, or to a polypeptide encoded by the polynucleotide. The 
integrated system, computer or computer readable medium may comprise a link between one or 
more sequence strings to a modified plant seed trait. 

In yet another aspect, the invention is a method for identifying a sequence similar 
or homologous to one or more polynucleotides of the invention, or one or more polypeptides 
encoded by the polynucleotides. The method comprises providing a sequence database; and, 
querying the sequence database with one or more target sequences corresponding to the one or 
more polynucleotides or to the one or more polypeptides to identify one or more sequence 
members of the database that display sequence similarity or homology to one or more of the one 
or more target sequences. 

The method may further comprise linking one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant seed
characteristics phenotype.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides a table of exemplary polynucleotide and polypeptide sequences of the
invention. The table includes from left to right for each sequence: the SEQ ID No., the internal
code reference number (GID), whether the sequence is a polynucleotide or polypeptide sequence,
and identification of any conserved domains for the polypeptide sequences.

Figure 2 provides a table of exemplary sequences that are homologous to other sequences 
provided in the Sequence Listing and that are derived from Arabidopsis thaliana. The table
includes from left to right: the SEQ ID No., the internal code reference number (GID), 
identification of the homologous sequence, whether the sequence is a polynucleotide or 
polypeptide sequence, and identification of any conserved domains for the polypeptide 
sequences. 

Figure 3 provides a table of exemplary sequences that are homologous to the sequences
provided in Figures 1 and 2 and that are derived from plants other than Arabidopsis thaliana. The
table includes from left to right: the SEQ ID No., the internal code reference number (GID), the 
unique GenBank sequence ID No. (NID), the probability that the comparison was generated by 
chance (P-value), and the species from which the homologous gene was identified. 

DETAILED DESCRIPTION 

The present invention relates to polynucleotides and polypeptides, e.g., for
modifying phenotypes of plants.

In particular, the polynucleotides or polypeptides are useful for modifying traits
associated with a plant's seed characteristics when the expression levels of the polynucleotides or
the expression levels or activity levels of the polypeptides are altered. Specifically, the
polynucleotides and polypeptides are useful for modifying the nutritional content or composition
of seeds, such as to modify the protein or oil content of seeds; modify insoluble sugar content
or composition, such as by altering the levels of arabinose, fucose, galactose, mannose, rhamnose
or xylose or the like; modify prenyl lipid content or composition, such as by altering the levels of
lutein, beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or
gamma-tocopherol or the like; modify fatty acid content or composition, such as by altering the
levels of the fatty acids 16:0 (palmitic acid), 16:1 (palmitoleic acid), 18:0 (stearic acid), 18:1
(oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 (eicosenoic acid), 20:2 and 22:1
(erucic acid); modify wax composition or content, such as by altering the levels of C29, C31, or
C33 alkanes; modify sterol composition or content, such as by altering the levels of
brassicasterol, campesterol, stigmasterol, sitosterol or stigmastanol or the like; or modify
glucosinolate composition or content.

Other seed characteristics that may be modified include traits relating to a seed's
germination characteristics; shelf-life; drydown characteristics; size; stress responses, such as to
heat, cold, salt or osmotic shock; other nutritional content, such as vitamins, minerals, or
flavonoids; seedling vigor; pest resistance; seed coat quality; resistance to pathogens;
germination rate; and resistance to heavy metals and toxins. Yet another desirable phenotype is a
change in the overall gene expression pattern of the seed.

The polynucleotides of the invention encode plant transcription factors. The plant
transcription factors are derived, e.g., from Arabidopsis thaliana and can belong, e.g., to one or
more of the following transcription factor families: the AP2 (APETALA2) domain transcription
factor family (Riechmann and Meyerowitz (1998) J. Biol. Chem. 379:633-646); the MYB
transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13:67-73); the MADS
domain transcription factor family (Riechmann and Meyerowitz (1997) J. Biol. Chem. 378:1079-1101);
the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244:563-571);
the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4:1575-1588); the
miscellaneous protein (MISC) family (Kim et al. (1997) Plant J. 11:1237-1251); the zinc finger
protein (Z) family (Klug and Schwabe (1995) FASEB J. 9:597-604); the homeobox (HB) protein
family (Duboule (1994) Guidebook to the Homeobox Genes, Oxford University Press); the
CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3:1166-1178); the
squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250:7-16);
the NAM protein family; the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373);
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the DNA-binding
protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002); the bZIP family of
transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the BPF-1 protein (Box P-binding
factor) family (da Costa e Silva et al. (1993) Plant J. 4:125-135); and the golden protein
(GLD) family (Hall et al. (1998) Plant Cell 10:925-936).

In addition to methods for modifying a plant phenotype by employing one or
more polynucleotides and polypeptides of the invention described herein, the polynucleotides
and polypeptides of the invention have a variety of additional uses. These uses include their use
in the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression;
as diagnostic probes for the presence of complementary or partially complementary nucleic acids
(including for detection of natural coding nucleic acids); as substrates for further reactions, e.g.,
mutation reactions, PCR reactions, or the like, or as substrates for cloning, e.g., including
digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the
transcription factors.

DEFINITIONS 

A "polynucleotide" is a nucleic acid sequence comprising a plurality of
polymerized nucleotide residues, e.g., at least about 15 consecutive polymerized nucleotide
residues, optionally at least about 30 consecutive nucleotides, or at least about 50 consecutive
nucleotides. In many instances, a polynucleotide comprises a nucleotide sequence encoding a
polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may
comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation
site, 5' or 3' untranslated regions, a reporter gene, a selectable marker, or the like. The
polynucleotide can be single stranded or double stranded DNA or RNA. The polynucleotide
optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g.,
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA,
a synthetic DNA or RNA, or the like. The polynucleotide can comprise a sequence in either
sense or antisense orientations.
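
As an illustration of the sense and antisense orientations mentioned above, the following minimal sketch derives the antisense (reverse-complement) strand of a DNA sequence. The function name and the nine-nucleotide example sequence are hypothetical and are not taken from the Sequence Listing.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the antisense strand of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


sense = "ATGGCTTCA"              # hypothetical sense (coding) strand
antisense = reverse_complement(sense)
print(antisense)                 # TGAAGCCAT
```

Applying the function twice returns the original sense sequence, which is a convenient sanity check.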

A "recombinant polynucleotide" is a polynucleotide that is not in its native state,
e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the
polynucleotide is in a context other than that in which it is naturally found, e.g., separated from
nucleotide sequences with which it typically is in proximity in nature, or adjacent to (or contiguous
with) nucleotide sequences with which it typically is not in proximity. For example, the sequence
at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic
acids.

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring or
recombinant, that is present outside the cell in which it is typically found in nature, whether
purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or
purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A "recombinant polypeptide" is a polypeptide produced by translation of a
recombinant polynucleotide. An "isolated polypeptide," whether a naturally occurring or a
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural 
state in a wild type cell, e.g., more than about 5% enriched, more than about 10% enriched, or 






WO 01/35727 PCT/US00/31457 


more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted:
105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such an
enrichment is not the result of a natural response of a wild type plant. Alternatively, or
additionally, the isolated polypeptide is separated from other cellular components with which it is
typically associated, e.g., by any of the various protein purification methods herein.

The term "transgenic plant" refers to a plant that contains genetic material not
found in a wild type plant of the same species, variety or cultivar. The genetic material may
include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA
insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous
recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic
material has been introduced into the plant by human manipulation.

A transgenic plant may contain an expression vector or cassette. The expression
cassette typically comprises a polypeptide-encoding sequence operably linked to (i.e., under
regulatory control of) appropriate inducible or constitutive regulatory sequences that allow for
the expression of the polypeptide. The expression cassette can be introduced into a plant by
transformation or by breeding after transformation of a parent plant. A plant refers to a whole
plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any
other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems
that mimic biochemical or cellular components or processes in a cell.

The phrase "ectopic expression or altered expression" in reference to a
polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is
different from the expression pattern in a wild type plant or a reference plant of the same species.
For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a
cell or tissue type in which the sequence is expressed in the wild type plant, or at a
time other than the time at which the sequence is expressed in the wild type plant, or in response to
different inducible agents, such as hormones or environmental signals, or at different expression
levels (either higher or lower) compared with those found in a wild type plant. The term also
refers to altered expression patterns that are produced by lowering the levels of expression to
below the detection level or completely abolishing expression. The resulting expression pattern
can be transient or stable, constitutive or inducible. In reference to a polypeptide, the term
"ectopic expression or altered expression" further may relate to altered activity levels resulting
from the interactions of the polypeptides with exogenous or endogenous modulators or from
interactions with factors or as a result of the chemical modification of the polypeptides.

The term "fragment" or "domain," with respect to a polypeptide, refers to a
subsequence of the polypeptide. In some cases, the fragment or domain is a subsequence of the
polypeptide which performs at least one biological function of the intact polypeptide in
substantially the same manner, or to a similar extent, as does the intact polypeptide. For example,
a polypeptide fragment can comprise a recognizable structural motif or functional domain such as
a DNA binding domain that binds to a DNA promoter region, an activation domain or a domain
for protein-protein interactions. Fragments can vary in size from as few as 6 amino acids to the
full length of the intact polypeptide, but are preferably at least about 30 amino acids in length and
more preferably at least about 60 amino acids in length. In reference to a nucleotide sequence, "a
fragment" refers to any subsequence of a polynucleotide, typically of at least about 15 consecutive
nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50, of any
of the sequences provided herein.
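
The notion of a fragment as a run of at least about 15 consecutive nucleotides can be made concrete with a short sketch. The function names and the 24-nucleotide reference sequence below are hypothetical, and the check uses exact string matching only; it is an illustration of the definition, not a method described in this document.

```python
def fragments(seq, min_len=15):
    """Yield every contiguous subsequence of seq with at least min_len residues."""
    for length in range(min_len, len(seq) + 1):
        for start in range(len(seq) - length + 1):
            yield seq[start:start + length]


def contains_fragment(candidate, reference, min_len=15):
    """Return True if candidate shares a run of at least min_len
    consecutive residues with reference (exact match, same strand)."""
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    return any(candidate[i:i + min_len] in windows
               for i in range(len(candidate) - min_len + 1))


reference = "ATGGCTTCAGGATCCAAGCTTGCA"       # hypothetical 24-nt sequence
probe = "TTTT" + reference[3:20] + "CCCC"     # embeds a 17-nt run of the reference
print(contains_fragment(probe, reference))    # True
```

Because any shared run longer than `min_len` necessarily contains a shared window of exactly `min_len` residues, comparing fixed-size windows is sufficient.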

The term "trait" refers to a physiological, morphological, biochemical or physical 
characteristic of a plant or particular plant material or cell. In some instances, this characteristic 
is visible to the human eye, such as seed or plant size, or can be measured by available 
biochemical techniques, such as the protein, starch or oil content of seed or leaves or by the 
observation of the expression level of genes, e.g., by employing Northern analysis, RT-PCR, 
microarray gene expression assays or reporter gene expression systems, or by agricultural 
observations such as stress tolerance, yield or pathogen tolerance. 

"Trait modification" refers to a detectable difference in a characteristic in a plant 
ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant 
not doing so, such as a wild type plant. In some cases, the trait modification can be evaluated 
quantitatively. For example, the trait modification can entail at least about a 2% increase or 
decrease in an observed trait (difference), at least a 5% difference, at least about a 10% 
difference, at least about a 20% difference, at least about a 30% difference, at least about a 50%
difference, at least about a 70% difference, or at least about a 100% difference, or an even greater
difference. It is known that there can be natural variation in the modified trait. Therefore, the
trait modification observed entails a change in the normal distribution of the trait in the plants
compared with the distribution observed in wild type plants.
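
The percentage thresholds in this definition can be illustrated with a minimal arithmetic sketch that compares the mean of a measured trait in modified plants against the wild type mean. The function names and all measurement values below are hypothetical placeholders, not data from this document, and a real evaluation would also compare the distributions rather than the means alone.

```python
from statistics import mean


def percent_difference(modified, wild_type):
    """Percent change of the modified-plant mean relative to the wild type mean."""
    wt_mean = mean(wild_type)
    if wt_mean == 0:
        raise ValueError("wild type mean must be non-zero")
    return 100.0 * (mean(modified) - wt_mean) / wt_mean


def meets_threshold(modified, wild_type, threshold_pct):
    """True if the absolute percent difference is at least threshold_pct."""
    return abs(percent_difference(modified, wild_type)) >= threshold_pct


# Hypothetical seed oil content measurements (arbitrary units).
wild_type = [30, 31, 32, 31]
modified = [33, 35, 34, 34]

print(round(percent_difference(modified, wild_type), 2))
print(meets_threshold(modified, wild_type, 5))    # at least a 5% difference?
print(meets_threshold(modified, wild_type, 10))   # at least a 10% difference?
```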

Trait modifications of particular interest include those to seed (such as embryo
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced
tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation,
radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved
tolerance to pest infestations, including nematodes, mollicutes, parasitic higher plants or the like;
decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up
heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day
length), or changes in expression levels of genes of interest. Other phenotypes that can be
modified relate to the production of plant metabolites, such as variations in the production of
taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, anti-oxidants,
amino acids, lignins, cellulose, tannins, prenyl lipids (such as chlorophylls and carotenoids),
glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production
(especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition.
Physical plant characteristics that can be modified include cell development (such as the number
of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves and
roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility
to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant
growth characteristics that can be modified include growth rate, germination rate of seeds, vigor
of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time,
flower abscission, rate of nitrogen uptake, biomass or transpiration characteristics, as well as
plant architecture characteristics such as apical dominance, branching patterns, number of organs,
organ identity, organ shape or size.

POLYPEPTIDES AND POLYNUCLEOTIDES OF THE INVENTION 

The present invention provides, among other things, transcription factors (TFs)
and transcription factor homologue polypeptides, and isolated or recombinant polynucleotides
encoding the polypeptides. These polypeptides and polynucleotides may be employed to modify
a plant's seed characteristics.

Exemplary polynucleotides encoding the polypeptides of the invention were
identified in the Arabidopsis thaliana GenBank database using publicly available sequence
analysis programs and parameters. Sequences initially identified were then further characterized
to identify sequences comprising specified sequence strings corresponding to sequence motifs
present in families of known transcription factors. Polynucleotide sequences meeting such
criteria were confirmed as transcription factors.

Additional polynucleotides of the invention were identified by screening
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to known
transcription factors under low stringency hybridization conditions. Additional sequences,
including full length coding sequences, were subsequently recovered by the rapid amplification of
cDNA ends (RACE) procedure, using a commercially available kit according to the
manufacturer's instructions. Where necessary, multiple rounds of RACE were performed to isolate
5' and 3' ends. The full length cDNA was then recovered by a routine end-to-end polymerase
chain reaction (PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences
are provided in the Sequence Listing.

The polynucleotides of the invention were ectopically expressed in overexpressor
or knockout plants and changes in the seed characteristics of the plants were observed.
Therefore, the polynucleotides and polypeptides can be employed to improve the seed
characteristics of plants.

Making polynucleotides 

The polynucleotides of the invention include sequences that encode transcription
factors and transcription factor homologue polypeptides and sequences complementary thereto, as
well as unique fragments of coding sequence, or sequence complementary thereto. Such
polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA,
cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or
single-stranded, and include either or both sense (i.e., coding) sequences and antisense (i.e.,
non-coding, complementary) sequences. The polynucleotides include the coding sequence of a
transcription factor, or transcription factor homologue polypeptide, in isolation, in combination
with additional coding sequences (e.g., a purification tag, a localization signal, as a fusion-protein,
as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or
inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a
vector or host environment in which the polynucleotide encoding a transcription factor or
transcription factor homologue polypeptide is an endogenous or exogenous gene.

A variety of methods exist for producing the polynucleotides of the invention.
Procedures for identifying and isolating DNA clones are well known to those of skill in the art,
and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods
in Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al.,
Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, 1989 ("Sambrook"); and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel").

Alternatively, polynucleotides of the invention can be produced by a variety of
in vitro amplification methods adapted to the present invention by appropriate selection of
specific or degenerate primers. Examples of protocols sufficient to direct persons of skill through
in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain
reaction (LCR), Qbeta-replicase amplification and other RNA polymerase mediated techniques
(e.g., NASBA), e.g., for the production of the homologous nucleic acids of the invention, are
found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) PCR Protocols A Guide
to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) ("Innis").
Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al.,
U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are
summarized in Cheng et al. (1994) Nature 369:684-685 and the references cited therein, in which
PCR amplicons of up to 40 kb are generated. One of skill will appreciate that essentially any
RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel,
Sambrook and Berger, all supra.

Alternatively, polynucleotides and oligonucleotides of the invention can be
assembled from fragments produced by solid-phase synthesis methods. Typically, fragments of
up to approximately 100 bases are individually synthesized and then enzymatically or chemically
ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a
transcription factor. For example, chemical synthesis using the phosphoramidite method is
described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22:1859-69; and Matthes et al.
(1984) EMBO J. 3:801-5. According to such methods, oligonucleotides are synthesized, purified,
annealed to their complementary strand, ligated and then optionally cloned into suitable vectors.
If so desired, the polynucleotides and polypeptides of the invention can be custom ordered
from any of a number of commercial suppliers.

HOMOLOGOUS SEQUENCES 

Sequences homologous, i.e., that share significant sequence identity or similarity,
to those provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants
of choice, are also an aspect of the invention. Homologous sequences can be derived from any
plant including monocots and dicots and in particular agriculturally important plant species,
including but not limited to, crops such as soybean, wheat, corn, potato, cotton, rice, oilseed rape
(including canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee,
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers,
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits (such as
apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage,
cauliflower, Brussels sprouts and kohlrabi). Other crops, fruits and vegetables whose phenotype
can be changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the walnut and
peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, yam, and sweet
potato, and beans. The homologous sequences may also be derived from woody species, such as
pine, poplar and eucalyptus.

Transcription factors that are homologous to the listed sequences will typically 
share at least about 31% amino acid sequence identity. More closely related transcription factors 
can share at least about 50%, about 60%, about 65%, about 70%, about 75% or about 80% or 
about 90% or about 95% or about 98% or more sequence identity with the listed sequences. 
Factors that are most closely related to the listed sequences share, e.g., at least about 85%, about 
90% or about 95% or more sequence identity to the listed sequences. At the nucleotide level, 
the sequences will typically share at least about 40% nucleotide sequence identity, preferably at 
least about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably 
about 85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the 
listed sequences. The degeneracy of the genetic code enables major variations in the nucleotide 
sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein. 
Conserved domains within a transcription factor family may exhibit a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative substitutions, and 
preferably at least 80% sequence identity. 
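
The identity thresholds above can be checked mechanically once two sequences have been aligned. The following is a minimal sketch and not part of the original disclosure: the function names `percent_identity` and `meets_threshold` are hypothetical, and the convention of counting only alignment columns in which neither sequence has a gap is an assumption made for illustration. Other conventions (for example, dividing by the length of the shorter sequence) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumption (not from the specification): only alignment columns in which
    neither sequence carries a gap are compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when identity is at least `threshold` percent (e.g., 31.0 for proteins)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, `meets_threshold(seq_a, seq_b, 31.0)` would test the lowest amino acid identity figure mentioned above, given two aligned protein sequences of equal length.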

Identifying Nucleic Acids by Hybridization 

Polynucleotides homologous to the sequences illustrated in the Sequence Listing 
can be identified, e.g., by hybridization to each other under stringent or under highly stringent 
conditions. Single stranded polynucleotides hybridize when they associate based on a variety of 
well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base 
stacking and the like. The stringency of a hybridization reflects the degree of sequence identity 
of the nucleic acids involved, such that the higher the stringency, the more similar are the two 
polynucleotide strands. Stringency is influenced by a variety of factors, including temperature, 
salt concentration and composition, organic and non-organic additives, solvents, etc. present in 
both the hybridization and wash solutions and incubations (and number), as described in more 
detail in the references cited above. 

An example of stringent hybridization conditions for hybridization of 
complementary nucleic acids which have more than 100 complementary residues on a filter in a 
Southern or northern blot is about 5°C to 20°C lower than the thermal melting point (Tm) for the 
specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined 
ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched 
probe. Nucleic acid molecules that hybridize under stringent conditions will typically hybridize 
to a probe based on either the entire cDNA or selected portions, e.g., to a unique subsequence, of 
the cDNA under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65°C, for example 
0.2x SSC, 0.1% SDS at 65°C. For identification of less closely related homologues washes can 
be performed at a lower temperature, e.g., 50°C. In general, stringency is increased by raising 
the wash temperature and/or decreasing the concentration of SSC. 
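
The relationship between Tm and the stringent temperature range can be sketched numerically. The example below is illustrative only. The specification does not say how Tm is to be calculated, so the empirical approximation used here (Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 675/N, for long DNA duplexes) is an assumption, as are the function names `gc_percent`, `estimate_tm` and `stringent_window`. Only the "5°C to 20°C below Tm" range comes from the text.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm(seq: str, na_molar: float) -> float:
    """Approximate Tm (deg C) for a long DNA duplex.

    Assumed empirical formula (not taken from the specification):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    n = len(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent(seq) - 675.0 / n


def stringent_window(tm: float) -> tuple[float, float]:
    """Temperature range from 20 deg C below Tm to 5 deg C below Tm."""
    return (tm - 20.0, tm - 5.0)
```

As a usage sketch, `stringent_window(estimate_tm(probe, 0.039))` would give a range for a probe washed in 0.2x SSC, taking 1x SSC as roughly 0.195 M sodium ion. Consistent with the text, the estimate falls as salt concentration decreases, so a fixed wash temperature becomes more stringent at lower SSC.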

As another example, stringent conditions can be selected such that an 
oligonucleotide that is perfectly complementary to the coding oligonucleotide hybridizes to the 
coding oligonucleotide with at least about a 5-10x higher signal to noise ratio than the ratio for 
hybridization of the perfectly complementary oligonucleotide to a nucleic acid encoding a 
transcription factor known as of the filing date of the application. Conditions can be selected 
such that a higher signal to noise ratio is observed in the particular assay which is used, e.g., 
about 15x, 25x, 35x, 50x or more. Accordingly, the subject nucleic acid hybridizes to the unique 
coding oligonucleotide with at least a 2x higher signal to noise ratio as compared to hybridization 
of the coding oligonucleotide to a nucleic acid encoding a known polypeptide. Again, higher 
signal to noise ratios can be selected, e.g., about 5x, 10x, 25x, 35x, 50x or more. The particular 
signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric 
label, a radioactive label, or the like. 

Alternatively, transcription factor homologue polypeptides can be obtained by 
screening an expression library using antibodies specific for one or more transcription factors. 
With the provision herein of the disclosed transcription factor, and transcription factor homologue 
nucleic acid sequences, the encoded polypeptide(s) can be expressed and purified in a 
heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or 
polyclonal) specific for the polypeptide(s) in question. Antibodies can also be raised against 
synthetic peptides derived from transcription factor, or transcription factor homologue, amino 
acid sequences. Methods of raising antibodies are well known in the art and are described in 
Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New 
York. Such antibodies can then be used to screen an expression library produced from the plant 
from which it is desired to clone additional transcription factor homologues, using the methods 
described above. The selected cDNAs can be confirmed by sequencing and enzymatic activity. 

SEQUENCE VARIATIONS 

It will readily be appreciated by those of skill in the art that any of a variety of 
polynucleotide sequences are capable of encoding the transcription factors and transcription 
factor homologue polypeptides of the invention. Due to the degeneracy of the genetic code, 
many different polynucleotides can encode identical and/or substantially similar polypeptides in 
addition to those sequences illustrated in the Sequence Listing. 

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC, 
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position in the 
sequence where there is a codon encoding serine, any of the above trinucleotide sequences can be 
used without altering the encoded polypeptide. 

Table 1 

Amino acid       3-letter   1-letter   Codons 
Alanine          Ala        A          GCA GCC GCG GCT 
Cysteine         Cys        C          TGC TGT 
Aspartic acid    Asp        D          GAC GAT 
Glutamic acid    Glu        E          GAA GAG 
Phenylalanine    Phe        F          TTC TTT 
Glycine          Gly        G          GGA GGC GGG GGT 
Histidine        His        H          CAC CAT 
Isoleucine       Ile        I          ATA ATC ATT 
Lysine           Lys        K          AAA AAG 
Leucine          Leu        L          TTA TTG CTA CTC CTG CTT 
Methionine       Met        M          ATG 
Asparagine       Asn        N          AAC AAT 
Proline          Pro        P          CCA CCC CCG CCT 
Glutamine        Gln        Q          CAA CAG 
Arginine         Arg        R          AGA AGG CGA CGC CGG CGT 
Serine           Ser        S          AGC AGT TCA TCC TCG TCT 
Threonine        Thr        T          ACA ACC ACG ACT 
Valine           Val        V          GTA GTC GTG GTT 
Tryptophan       Trp        W          TGG 
Tyrosine         Tyr        Y          TAC TAT 

Sequence alterations that do not change the amino acid sequence encoded by the 
polynucleotide are termed "silent" variations. With the exception of the codons ATG and TGG, 
encoding methionine and tryptophan, respectively, any of the possible codons for the same amino 
acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the 
art. Accordingly, any and all such variations of a sequence selected from the above table are a 
feature of the invention. 

In addition to silent variations, other conservative variations that alter one, or a 
few amino acids in the encoded polypeptide, can be made without altering the function of the 
polypeptide; these conservative variants are, likewise, a feature of the invention. 
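
Whether a nucleotide change is silent in the sense defined above can be checked by translating both sequences with the codon assignments of Table 1. The sketch below is illustrative only and not part of the original disclosure. The names `CODONS`, `translate`, `is_silent` and `count_encodings` are hypothetical, codons are written in the DNA alphabet (so the fourth alanine codon is given as GCT), and stop codons are deliberately left out because Table 1 does not list them.

```python
from math import prod

# Codon assignments of Table 1, keyed by one-letter amino acid code.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCT"],
    "C": ["TGC", "TGT"],
    "D": ["GAC", "GAT"],
    "E": ["GAA", "GAG"],
    "F": ["TTC", "TTT"],
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "H": ["CAC", "CAT"],
    "I": ["ATA", "ATC", "ATT"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTA", "CTC", "CTG", "CTT"],
    "M": ["ATG"],
    "N": ["AAC", "AAT"],
    "P": ["CCA", "CCC", "CCG", "CCT"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGT"],
    "S": ["AGC", "AGT", "TCA", "TCC", "TCG", "TCT"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
    "V": ["GTA", "GTC", "GTG", "GTT"],
    "W": ["TGG"],
    "Y": ["TAC", "TAT"],
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(cds: str) -> str:
    """Translate a coding sequence (no stop codon) into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    try:
        return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))
    except KeyError as exc:
        raise ValueError(f"codon not in Table 1: {exc.args[0]}") from None


def is_silent(original: str, variant: str) -> bool:
    """True when the two sequences differ but encode the same polypeptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences that encode the given peptide."""
    return prod(len(CODONS[aa]) for aa in peptide.upper())
```

Because methionine and tryptophan each have a single codon, they contribute a factor of one to `count_encodings`, which matches the exception noted in the text.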

For example, substitutions, deletions and insertions introduced into the sequences 
provided in the Sequence Listing are also envisioned by the invention. Such sequence 
modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Meth. 
Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid 
substitutions are typically of single residues; insertions usually will be on the order of from about 
1 to 10 amino acid residues; and deletions will range from about 1 to 30 residues. In preferred 
embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or 
insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be 
combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the 
transcription factor should not place the sequence out of reading frame and should not create 
complementary regions that could produce secondary mRNA structure. Preferably, the 
polypeptide encoded by the DNA performs the desired function. 
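
The reading-frame requirement stated above can be expressed as a simple length check. The following sketch is an illustration under stated assumptions rather than a method from the specification. The function name `indel_summary` is hypothetical, and the check looks only at the net length change, so a frameshift that is compensated further downstream would not be detected. The residue ranges are the typical figures quoted in the text.

```python
def indel_summary(original_cds: str, mutated_cds: str) -> dict:
    """Summarize the net insertion or deletion between two coding sequences."""
    delta = len(mutated_cds) - len(original_cds)

    # A net change that is a multiple of three keeps downstream codons in frame.
    in_frame = delta % 3 == 0

    if delta > 0:
        kind = "insertion"
    elif delta < 0:
        kind = "deletion"
    else:
        kind = "no net length change"

    # Number of amino acid residues gained or lost (undefined for a frameshift).
    residues = abs(delta) // 3 if in_frame else None

    # Typical sizes quoted in the text: insertions of about 1 to 10 residues,
    # deletions of about 1 to 30 residues.
    limits = {"insertion": 10, "deletion": 30}
    within_typical_range = None
    if in_frame and kind in limits:
        within_typical_range = 1 <= residues <= limits[kind]

    return {
        "kind": kind,
        "in_frame": in_frame,
        "residues": residues,
        "within_typical_range": within_typical_range,
    }
```

A result with `in_frame` set to `False` flags a variant that would place the sequence out of reading frame. The secondary-structure criterion mentioned in the text is not addressed by this sketch.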

Conservative substitutions are those in which at least one residue in the amino 
acid sequence has been removed and a different residue inserted in its place. Such substitutions 
generally are made in accordance with Table 2 when it is desired to maintain the activity of 
the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein 
and which are typically regarded as conservative substitutions. 

Table 2 

Residue   Conservative Substitutions 
Ala       Ser 
Arg       Lys 
Asn       Gln; His 
Asp       Glu 
Gln       Asn 
Cys       Ser 
Glu       Asp 
Gly       Pro 
His       Asn; Gln 
Ile       Leu; Val 
Leu       Ile; Val 
Lys       Arg; Gln 
Met       Leu; Ile 
Phe       Met; Leu; Tyr 
Ser       Thr; Gly 
Thr       Ser; Val 
Trp       Tyr 
Tyr       Trp; Phe 
Val       Ile; Leu 
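
Table 2 can be read as a lookup from an original residue to its permitted replacements. The sketch below encodes it that way for illustration. It is not part of the original disclosure, the names `CONSERVATIVE_SUBSTITUTIONS`, `classify_substitution` and `classify_variant` are hypothetical, and residues are given in standard three-letter codes. The lookup is deliberately directional, because Table 2 lists, for example, Ser as a substitute for Ala but not Ala as a substitute for Ser.

```python
# Table 2, keyed by the original residue (standard three-letter codes).
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def classify_substitution(original: str, replacement: str) -> str:
    """Label one residue change as identical, conservative or non-conservative."""
    if original == replacement:
        return "identical"
    if replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set()):
        return "conservative"
    return "non-conservative"


def classify_variant(original: list[str], variant: list[str]) -> list[tuple[int, str, str, str]]:
    """List (1-based position, original, replacement, label) for each changed residue."""
    if len(original) != len(variant):
        raise ValueError("sequences must have the same number of residues")
    changes = []
    for position, (a, b) in enumerate(zip(original, variant), start=1):
        if a != b:
            changes.append((position, a, b, classify_substitution(a, b)))
    return changes
```

Residues with no row in Table 2, such as Pro, are treated here as having no conservative replacement. That is a choice made for this sketch, not a statement in the text.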



Substitutions that are less conservative than those in Table 2 can be selected by 
picking residues that differ more significantly in their effect on maintaining (a) the structure of 
the polypeptide backbone in the area of the substitution, for example, as a sheet or helical 
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of 
the side chain. The substitutions which in general are expected to produce the greatest changes in 
protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is 
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; 
(b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an 
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., 
phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine. 

FURTHER MODIFYING SEQUENCES OF THE INVENTION: MUTATION/FORCED EVOLUTION 

In addition to generating silent or conservative substitutions as noted above, the 
present invention optionally includes methods of modifying the sequences of the Sequence 
Listing. In the methods, nucleic acid or protein modification methods are used to alter the given 
sequences to produce new sequences and/or to chemically or enzymatically modify given 
sequences to change the properties of the nucleic acids or proteins. 

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., 
according to standard mutagenesis or artificial evolution methods to produce modified sequences. 
For example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial 
forced evolution methods are described, e.g., by Stemmer (1994) Nature 370:389-391, and 
Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Many other mutation and 
evolution methods are also available and expected to be within the skill of the practitioner. 

Similarly, chemical or enzymatic alteration of expressed nucleic acids and 
polypeptides can be performed by standard methods. For example, a sequence can be modified by 
addition of lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified 
nucleotides or amino acids, or the like. For example, protein modification techniques are 
illustrated in Ausubel, supra. Further details on chemical and enzymatic modifications can be 
found herein. These modification methods can be used to modify any given sequence, or to 
modify any sequence produced by the various mutation and artificial evolution modification 
methods noted herein. 

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available methods, as well 
as for the products produced by practicing such methods, e.g., using the sequences herein as a 
starting substrate for the various modification approaches. 

For example, optimized coding sequence containing codons preferred by a 
particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to 
produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as 
compared with transcripts produced using a non-optimized sequence. Translation stop codons 
can also be modified to reflect host preference. For example, preferred stop codons for S. 
cerevisiae and mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as the stop 
codon. 

The polynucleotide sequences of the present invention can also be engineered in 
order to alter a coding sequence for a variety of reasons, including but not limited to, alterations 
which modify the sequence to facilitate cloning, processing and/or expression of the gene 
product. For example, alterations are optionally introduced using techniques which are well 
known in the art, e.g., site-directed mutagenesis, to insert new restriction sites, to alter 
glycosylation patterns, to change codon preference, to introduce splice sites, etc. 
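
The stop-codon preferences listed above lend themselves to a simple lookup. The following sketch is an illustration under stated assumptions, not a method taken from the specification. The function name `set_preferred_stop` and the host labels used as dictionary keys are hypothetical, and only the terminal stop codon is changed. Full codon optimization would additionally need a host-specific codon usage table, which is not given here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Preferred stop codon per host, as stated in the text above.
# The key strings are labels chosen for this sketch.
PREFERRED_STOP = {
    "S. cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "E. coli": "TAA",
}


def set_preferred_stop(cds: str, host: str) -> str:
    """Return the coding sequence with its stop codon replaced by the host's preference."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end in a stop codon")
    if host not in PREFERRED_STOP:
        raise KeyError(f"no stop codon preference recorded for host: {host}")
    # Only the final codon changes, so the encoded polypeptide is unaffected.
    return cds[:-3] + PREFERRED_STOP[host]
```

For example, a coding sequence ending in TAG and intended for expression in a monocotyledonous plant would be returned ending in TGA.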

Furthermore, a fragment or domain derived from any of the polypeptides of the 
invention can be combined with domains derived from other transcription factors or synthetic 
domains to modify the biological activity of a transcription factor. For instance, a DNA binding 
domain derived from a transcription factor of the invention can be combined with the activation 
domain of another transcription factor or with a synthetic activation domain. A transcription 
activation domain assists in initiating transcription from a DNA binding site. Examples include 
the transcription activation region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. 
USA 95: 376-381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from 
bacterial sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides (Giniger 
and Ptashne, (1987) Nature 330:670-672). 

EXPRESSION AND MODIFICATION OF POLYPEPTIDES 

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the invention in 
appropriate host cells, transgenic plants, in vitro translation systems, or the like. Due to the 
inherent degeneracy of the genetic code, nucleic acid sequences which encode substantially the 
same or a functionally equivalent amino acid sequence can be substituted for any listed sequence 
to provide for cloning and expressing the relevant homologue. 

Vectors, Promoters and Expression Systems 

The present invention includes recombinant constructs comprising one or more 
of the nucleic acid sequences herein. The constructs typically comprise a vector, such as a 
plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial artificial chromosome (BAC), 
a yeast artificial chromosome (YAC), or the like, into which a nucleic acid sequence of the 
invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, a 
promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are 
known to those of skill in the art, and are commercially available. 

General texts which describe molecular biological techniques useful herein, 
including the use and production of vectors, promoters and many other relevant topics, include 
Berger, Sambrook and Ausubel, supra. Any of the identified sequences can be incorporated into a 
cassette or vector, e.g., for expression in plants. A number of expression vectors suitable for stable 
transformation of plant cells or for the establishment of transgenic plants have been described 
including those described in Weissbach and Weissbach (1989) Methods for Plant Molecular 
Biology, Academic Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer 
Academic Publishers. Specific examples include those derived from a Ti plasmid of 
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. (1983) Nature 
303: 209, Bevan (1984) Nucl Acid Res 12: 8711-8721, Klee (1985) Bio/Technology 3: 637-642, 
for dicotyledonous plants. 

Alternatively, non-Ti vectors can be used to transfer the DNA into 
monocotyledonous plants and cells by using free DNA delivery techniques. Such methods can 
involve, for example, the use of liposomes, electroporation, microprojectile bombardment, silicon 
carbide whiskers, and viruses. By using these methods transgenic plants such as wheat, rice 
(Christou (1991) Bio/Technology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603- 
618) can be produced. An immature embryo can also be a good target tissue for monocots for 
direct DNA delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemeaux (1994) Plant Physiol 
104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotech 
14:745-750). 

Typically, plant transformation vectors include one or more cloned plant coding 
sequence (genomic or cDNA) under the transcriptional control of 5' and 3' regulatory sequences 
and a dominant selectable marker. Such plant transformation vectors typically also contain a 
promoter (e.g., a regulatory region controlling inducible or constitutive, environmentally- or 
developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start 
site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or 
a polyadenylation signal. 

Examples of constitutive plant promoters which can be useful for expressing the 
TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers 
constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. (1985) Nature 
313:810); the nopaline synthase promoter (An et al. (1988) Plant Physiol 88:547); and the 
octopine synthase promoter (Fromm et al. (1989) Plant Cell 1:977). 

A variety of plant gene promoters that regulate gene expression in response to 
environmental, hormonal, chemical, developmental signals, and in a tissue-active manner can be 
used for expression of a TF sequence in plants. Choice of a promoter is based largely on the 
phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen, 
vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold, 
drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known 
promoters have been characterized and can favorably be employed to promote expression of a 
polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue 
specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3 
promoter described in US Pat. No. 5,773,697), fruit-specific promoters that are active during fruit 
ripening (such as the dru1 promoter (US Pat. No. 5,783,393), or the 2A11 promoter (US Pat. No. 
4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)), 
root-specific promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and 
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 5,792,929), 
promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol Biol 37:977-988), 
flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243), pollen (Baerson et al. (1994) Plant Mol 
Biol 26:1947-1959), carpels (Ohl et al. (1990) Plant Cell 2:837-848), pollen and ovules (Baerson 
et al. (1993) Plant Mol Biol 22:255-267), auxin-inducible promoters (such as that described in 
van der Kop et al. (1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell 11:323- 
334), cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753), 
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, Willmott et 
al. (1998) Plant Mol Biol 38:817-825) and the like. Additional promoters are those that elicit expression in 
response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), light (e.g., the pea rbcS-3A 
promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and the maize rbcS promoter, Schaffner and 
Sheen (1991) Plant Cell 3: 997); wounding (e.g., wun1, Siebertz et al. (1989) Plant Cell 1:961); 
pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol. 40:387- 
396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol 38:1071-80), 
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol Biol 48: 89- 
108). In addition, the timing of the expression can be controlled by using promoters such as those 
acting at senescence (Gan and Amasino (1995) Science 270:1986-1988); or late seed development 
(Odell et al. (1994) Plant Physiol 106:447-458). 

Plant expression vectors can also include RNA processing signals that can be 
positioned within, upstream or downstream of the coding sequence. In addition, the expression 
vectors can include additional regulatory sequences from the 3'-untranslated region of plant 
genes, e.g., a 3' terminator region to increase stability of the mRNA, such as the PI-II 
terminator region of potato or the octopine or nopaline synthase 3' terminator regions. 

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences. 
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where 
a coding sequence, its initiation codon and upstream sequences are inserted into the appropriate 
expression vector, no additional translational control signals may be needed. However, in cases 
where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is 
inserted, exogenous translational control signals including the ATG initiation codon can be 
separately provided. The initiation codon is provided in the correct reading frame to facilitate 
translation. Exogenous translational elements and initiation codons can be of various origins, 
both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of 
enhancers appropriate to the cell system in use. 

Expression Hosts 

The present invention also relates to host cells which are transduced with vectors 
of the invention, and the production of polypeptides of the invention (including fragments 
thereof) by recombinant techniques. Host cells are genetically engineered (i.e., nucleic acids are 
introduced, e.g., transduced, transformed or transfected) with the vectors of this invention, which 
may be, for example, a cloning vector or an expression vector comprising the relevant nucleic 
acids herein. The vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants, or amplifying the relevant gene. 
The culture conditions, such as temperature, pH and the like, are those previously used with the 
host cell selected for expression, and will be apparent to those skilled in the art and in the 
references cited herein, including Sambrook and Ausubel. 

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the 
host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for 
some applications. For example, the DNA fragments are introduced into plant tissues, cultured 
plant cells or plant protoplasts by standard methods including electroporation (Fromm et al., 
(1985) Proc. Natl. Acad. Sci. USA 82, 5824), infection by viral vectors such as cauliflower mosaic 
virus (CaMV) (Hohn et al., (1982) Molecular Biology of Plant Tumors, (Academic Press, New 
York) pp. 549-560; US 4,407,956), high velocity ballistic penetration by small particles with the 
nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al., 
(1987) Nature 327, 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium 
tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned. 
The T-DNA plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens, 
and a portion is stably integrated into the plant genome (Horsch et al. (1984) Science 233:496-498; 
Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80, 4803). 

The cell can include a nucleic acid of the invention which encodes a polypeptide, 
wherein the cell expresses a polypeptide of the invention. The cell can also include vector 
sequences, or the like. Furthermore, cells and transgenic plants which include any polypeptide or 
nucleic acid above or throughout this specification, e.g., produced by transduction of a vector of 
the invention, are an additional feature of the invention. 

For long-term, high-yield production of recombinant proteins, stable expression 
can be used. Host cells transformed with a nucleotide sequence encoding a polypeptide of the 
invention are optionally cultured under conditions suitable for the expression and recovery of the 
encoded protein from cell culture. The protein or fragment thereof produced by a recombinant 
cell may be secreted, membrane-bound, or contained intracellularly, depending on the sequence 
and/or the vector used. As will be understood by those of skill in the art, expression vectors 
containing polynucleotides encoding mature proteins of the invention can be designed with signal 
sequences which direct secretion of the mature polypeptides through a prokaryotic or eukaryotic 
cell membrane. 

Modified Amino Acids 

Polypeptides of the invention may contain one or more modified amino acids. 
The presence of modified amino acids may be advantageous in, for example, increasing 
polypeptide half-life, reducing polypeptide antigenicity or toxicity, increasing polypeptide storage 
stability, or the like. Amino acid(s) are modified, for example, co-translationally or 
post-translationally during recombinant production or modified by synthetic or chemical means. 

Non-limiting examples of a modified amino acid include incorporation or other 
use of acetylated amino acids, glycosylated amino acids, sulfated amino acids, prenylated (e.g., 
farnesylated, geranylgeranylated) amino acids, PEG modified (e.g., "PEGylated") amino acids, 
biotinylated amino acids, carboxylated amino acids, phosphorylated amino acids, etc. References 
adequate to guide one of skill in the modification of amino acids are replete throughout the 
literature. 

IDENTIFICATION OF ADDITIONAL FACTORS 

A transcription factor provided by the present invention can also be used to 
identify additional endogenous or exogenous molecules that can affect a phenotype or trait of 
interest. On the one hand, such molecules include organic (small or large molecules) and/or 
inorganic compounds that affect expression of (i.e., regulate) a particular transcription factor. 
Alternatively, such molecules include endogenous molecules that are acted upon at a 
transcriptional level by a transcription factor of the invention to modify a phenotype as desired. 
For example, the transcription factors can be employed to identify one or more downstream genes 
that are subject to a regulatory effect of the transcription factor. In one approach, a 
transcription factor or transcription factor homologue of the invention is expressed in a host cell, 
e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of 
likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid 
probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional 
gel electrophoresis of protein products, or by any other method known in the art for assessing 
expression of gene products at the level of RNA or protein. Alternatively, a transcription factor 
of the invention can be used to identify promoter sequences (i.e., binding sites) involved in the 
regulation of a downstream target. After identifying a promoter sequence, interactions between 
the transcription factor and the promoter sequence can be modified by changing specific 
nucleotides in the promoter sequence or specific amino acids in the transcription factor that 
interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA 
binding sites are identified by gel shift assays. After identifying the promoter regions, the 
promoter region sequences can be employed in double-stranded DNA arrays to identify 
molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al. 
(1999) Nature Biotechnology 17:573-577). 

The identified transcription factors are also useful to identify proteins that modify 
the activity of the transcription factor. Such modification can occur by covalent modification, 
such as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any 
method suitable for detecting protein-protein interactions can be employed. Among the methods 
that can be employed are co-immunoprecipitation, cross-linking and co-purification through 
gradients or chromatographic columns, and the two-hybrid yeast system. 

The two-hybrid system detects protein interactions in vivo and is described in
Chien, et al., (1991), Proc. Natl. Acad. Sci. USA 88, 9578-9582 and is commercially available
from Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two
hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein
fused to the TF polypeptide and the other consists of the transcription activator protein's
activation domain fused to an unknown protein that is encoded by a cDNA that has been
recombined into the plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid
and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that
contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's
binding site. Neither hybrid protein alone can activate transcription of the reporter gene.
Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in
expression of the reporter gene, which is detected by an assay for the reporter gene product. Then,
the library plasmids responsible for reporter gene expression are isolated and sequenced to
identify the proteins encoded by the library plasmids. After identifying proteins that interact with
the transcription factors, assays for compounds that interfere with the TF protein-protein
interactions can be performed.

IDENTIFICATION OF MODULATORS

In addition to the intracellular molecules described above, extracellular
molecules that alter activity or expression of a transcription factor, either directly or indirectly,
can be identified. For example, the methods can entail first placing a candidate molecule in
contact with a plant or plant cell. The molecule can be introduced by topical administration, such
as spraying or soaking of a plant, and then the molecule's effect on the expression or activity of
the TF polypeptide or the expression of the polynucleotide is monitored. Changes in the expression
of the TF polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel
electrophoresis or the like. Changes in the expression of the corresponding polynucleotide
sequence can be detected by use of microarrays, Northerns, quantitative PCR, or any other
technique for monitoring changes in mRNA expression. These techniques are exemplified in
Ausubel et al. (eds) Current Protocols in Molecular Biology, John Wiley & Sons (1998). Such
changes in the expression levels can be correlated with modified plant traits, and thus identified
molecules can be useful for soaking or spraying on fruit, vegetable and grain crops to modify
traits in plants.

Essentially any available composition can be tested for modulatory activity of
expression or activity of any nucleic acid or polypeptide herein. Thus, available libraries of
compounds such as chemicals, polypeptides, nucleic acids and the like can be tested for
modulatory activity. Often, potential modulator compounds can be dissolved in aqueous or
organic (e.g., DMSO-based) solutions for easy delivery to the cell or plant of interest in which the
activity of the modulator is to be tested. Optionally, the assays are designed to screen large
modulator composition libraries by automating the assay steps and providing compounds from
any convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on
microtiter plates in robotic assays).

In one embodiment, high throughput screening methods involve providing a
combinatorial library containing a large number of potential compounds (potential modulator 
compounds). Such "combinatorial chemical libraries" are then screened in one or more assays, as 
described herein, to identify those library members (particular chemical species or subclasses) 
that display a desired characteristic activity. The compounds thus identified can serve as target 
compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a set of 
chemical building blocks (e.g., in one example, amino acids) in every possible way for a given 
compound length (i.e., the number of amino acids in a polypeptide compound of a set length). 
Exemplary libraries include peptide libraries, nucleic acid libraries, antibody libraries (see, e.g.,
Vaughn et al. (1996) Nature Biotechnology 14(3):309-314 and PCT/US96/10287), carbohydrate
libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Patent 5,593,853),
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule
libraries (see, e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S.
Patent 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S.
Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337) and the like.
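
By way of illustration only, the enumeration of building blocks "in every possible way for a given compound length" can be sketched in Python as follows. The reduced four-letter building-block set and the function name are assumptions made for the example; an actual polypeptide library would draw on the full set of amino acids.

```python
from itertools import product

# Hypothetical, reduced building-block set (one-letter amino acid codes).
BUILDING_BLOCKS = "ACDE"


def enumerate_library(blocks: str, length: int) -> list[str]:
    """Return every sequence of `length` residues drawn from `blocks`."""
    return ["".join(combo) for combo in product(blocks, repeat=length)]


# A library of tripeptides over four building blocks has 4**3 members.
library = enumerate_library(BUILDING_BLOCKS, 3)
print(len(library))
```

The library size grows as the number of building blocks raised to the compound length, which is one reason such libraries are typically screened in automated, parallel formats.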

Preparation and screening of combinatorial or other libraries is well known to 
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, 
peptide libraries (see, e.g., U.S. Patent 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493 
(1991) and Houghton et al. Nature 354:84-88 (1991)). Other chemistries for generating chemical
diversity libraries can also be used. 

In addition, as noted, compound screening equipment for high-throughput 
screening is generally available, e.g., using any of a number of well known robotic systems that 
have also been developed for solution phase chemistries useful in assay systems. These systems 
include automated workstations including an automated synthesis apparatus and robotic systems 
utilizing robotic arms. Any of the above devices are suitable for use with the present invention, 
e.g., for high-throughput screening of potential modulators. The nature and implementation of 
modifications to these devices (if any) so that they can operate as discussed herein will be 
apparent to persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. 
These systems typically automate entire procedures including all sample and reagent pipetting, 
liquid dispensing, timed incubations, and final readings of the microplate in detector(s) 
appropriate for the assay. These configurable systems provide high throughput and rapid start up
as well as a high degree of flexibility and customization. Similarly, microfluidic implementations 
of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high
throughput assays. Thus, for example, Zymark Corp. provides technical bulletins describing screening
systems for detecting the modulation of gene transcription, ligand binding, and the like. The
integrated systems herein, in addition to providing for sequence alignment and, optionally,
synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators
that have an effect on one or more polynucleotides or polypeptides according to the present
invention.

In some assays it is desirable to have positive controls to ensure that the
components of the assays are working properly. At least two types of positive controls are
appropriate. That is, known transcriptional activators or inhibitors can be incubated with
cells, plants, etc. in one sample of the assay, and the resulting increase or decrease in transcription
can be detected by measuring the resulting change in RNA or protein expression, etc., according to
the methods herein. It will be appreciated that modulators can also be combined with
transcriptional activators or inhibitors to find modulators which inhibit transcriptional activation
or transcriptional repression. Either expression of the nucleic acids and proteins herein or any
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can
be monitored.

In an embodiment, the invention provides a method for identifying compositions
that modulate the activity or expression of a polynucleotide or polypeptide of the invention. For
example, a test compound, whether a small or large molecule, is placed in contact with a cell,
plant (or plant tissue or explant), or composition comprising the polynucleotide or polypeptide of
interest, and a resulting effect on the cell, plant (or tissue or explant) or composition is evaluated
by monitoring, either directly or indirectly, one or more of: expression level of the polynucleotide
or polypeptide, or activity (or modulation of the activity) of the polynucleotide or polypeptide. In
some cases, an alteration in a plant phenotype can be detected following contact of a plant (or
plant cell, or tissue or explant) with the putative modulator, e.g., by modulation of expression or
activity of a polynucleotide or polypeptide of the invention.



SUBSEQUENCES 

Also contemplated are uses of polynucleotides, also referred to herein as
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least
20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or
ultra-ultra-high stringent) conditions to a polynucleotide sequence described above.
The polynucleotides may be used as probes, primers, sense and antisense agents, and the like,
according to methods as noted supra.

Subsequences of the polynucleotides of the invention, including polynucleotide
fragments and oligonucleotides, are useful as nucleic acid probes and primers. An oligonucleotide
suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least
about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides,
or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization
protocols, e.g., to identify additional polypeptide homologues of the invention, including
protocols for microarray experiments. Primers can be annealed to a complementary target DNA
strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA
strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer
pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain
reaction (PCR) or other nucleic-acid amplification methods. See Sambrook and Ausubel, supra.
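
As a minimal sketch only, candidate probe or primer subsequences of a chosen length can be enumerated from a polynucleotide as shown below. The 24-base sequence and the function name are hypothetical and serve only to illustrate the windowing; no sequence of the invention is implied, and practical probe design would apply further criteria not modeled here.

```python
def candidate_oligos(sequence: str, length: int = 15) -> list[tuple[int, str]]:
    """Return (start, oligo) pairs for every contiguous subsequence of `length` bases."""
    if length > len(sequence):
        return []
    return [(i, sequence[i:i + length]) for i in range(len(sequence) - length + 1)]


# Hypothetical 24-base sequence used only for illustration.
example = "ATGGCTGAAAAACGTTTGACCGTA"
oligos = candidate_oligos(example, 15)
print(len(oligos), oligos[0])
```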

In addition, the invention includes an isolated or recombinant polypeptide
including a subsequence of at least about 15 contiguous amino acids encoded by the recombinant
or isolated polynucleotides of the invention. For example, such polypeptides, or domains or
fragments thereof, can be used as immunogens, e.g., to produce antibodies specific for the
polypeptide sequence, or as probes for detecting a sequence of interest. A subsequence can range
in size from about 15 amino acids in length up to and including the full length of the polypeptide.

PRODUCTION OF TRANSGENIC PLANTS 
Modification of Traits 

The polynucleotides of the invention are favorably employed to produce
transgenic plants with various traits, or characteristics, that have been modified in a desirable
manner, e.g., to improve the seed characteristics of a plant. For example, alteration of expression
levels or patterns (e.g., spatial or temporal expression patterns) of one or more of the transcription
factors (or transcription factor homologues) of the invention, as compared with the levels of the
same protein found in a wild type plant, can be used to modify a plant's traits. An illustrative
example of trait modification, improved seed characteristics, by altering expression levels of a
particular transcription factor is described further in the Examples and the Sequence Listing.

Antisense and Cosuppression Approaches

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also useful for 
sense and anti-sense suppression of expression, e.g., to down-regulate expression of a nucleic 
acid of the invention, e.g., as a further mechanism for modulating plant phenotype. That is, the 
nucleic acids of the invention, or subsequences or anti-sense sequences thereof, can be used to
block expression of naturally occurring homologous nucleic acids. A variety of sense and
anti-sense technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997)
Antisense Technology: A Practical Approach IRL Press at Oxford University, Oxford, England.
In general, sense or anti-sense sequences are introduced into a cell, where they are optionally
amplified, e.g., by transcription. Such sequences include both simple oligonucleotide sequences
and catalytic sequences such as ribozymes.

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic plant, e.g., to 
modify a plant trait, can be obtained by introducing an antisense construct corresponding to the 
polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homologue 
cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the 
promoter sequence in the expression vector. The introduced sequence need not be the full length 
cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be 
transformed. Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher 
degree of homology to the endogenous transcription factor sequence will be needed for effective 
antisense suppression. While antisense sequences of various lengths can be utilized, preferably, 
the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and
improved antisense suppression will typically be observed as the length of the antisense sequence 
increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 
nucleotides. Transcription of an antisense construct as described results in the production of 
RNA molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
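
The reverse-complement relationship between an antisense transcript and the endogenous mRNA can be illustrated with the following sketch. The short mRNA fragment and the function name are hypothetical and chosen only for the example.

```python
# Watson-Crick complement table for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an mRNA sequence written 5' to 3'."""
    return mrna.translate(RNA_COMPLEMENT)[::-1]


# Hypothetical mRNA fragment used only for illustration.
fragment = "AUGGCC"
print(antisense_rna(fragment))
```

Applying the function twice returns the original sequence, which reflects the fact that the two strands of the resulting duplex are reverse complements of each other.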

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly specific 
endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Patent No. 
4,987,071 and U.S. Patent No. 5,543,508. Synthetic ribozyme sequences including antisense 
RNAs can be used to confer RNA cleaving activity on the antisense RNA, such that endogenous 
mRNA molecules that hybridize to the antisense RNA are cleaved, which in turn leads to an
enhanced antisense inhibition of endogenous gene expression. 

Vectors in which RNA encoded by a transcription factor or transcription factor
homologue cDNA is over-expressed can also be used to obtain co-suppression of a corresponding 
endogenous gene, e.g., in the manner described in U.S. Patent No. 5,231,020 to Jorgensen. Such 
co-suppression (also termed sense suppression) does not require that the entire transcription factor 
cDNA be introduced into the plant cells, nor does it require that the introduced sequence be 
exactly identical to the endogenous transcription factor gene of interest. However, as with 
antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization 
is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g.,
sequences comprising one or more stop codons or nonsense mutations) can also be used to
suppress expression of an endogenous transcription factor, thereby reducing or eliminating its
activity and modifying one or more traits. Methods for producing such constructs are described
in U.S. Patent No. 5,583,021. Preferably, such constructs are made by introducing a premature 
stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene 
silencing using double-strand RNA (Sharp (1999) Genes and Development 13: 139-141). 
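
Purely as an illustration of the premature stop codon approach, the following sketch replaces one in-frame codon of a coding sequence with a stop codon and reports the position of the first stop. The coding sequence, the function names and the choice of TAA as the introduced stop are assumptions made for the example, not a description of the constructs of U.S. Patent No. 5,583,021.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into complete in-frame codons."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def first_stop_index(cds: str) -> Optional[int]:
    """Return the index of the first in-frame stop codon, or None if absent."""
    for index, codon in enumerate(split_codons(cds)):
        if codon in STOP_CODONS:
            return index
    return None


def introduce_premature_stop(cds: str, codon_index: int, stop: str = "TAA") -> str:
    """Replace the codon at `codon_index` (not the start codon) with a stop codon."""
    parts = split_codons(cds)
    if not 0 < codon_index < len(parts):
        raise ValueError("codon_index must fall inside the coding sequence")
    parts[codon_index] = stop
    # Re-attach any trailing bases that did not form a complete codon.
    return "".join(parts) + cds[len(parts) * 3:]


# Hypothetical five-codon coding sequence: ATG GCT GAA AAA TAA.
cds = "ATGGCTGAAAAATAA"
mutant = introduce_premature_stop(cds, 2)
print(first_stop_index(cds), first_stop_index(mutant))
```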

Another method for abolishing the expression of a gene is by insertion
mutagenesis using the T-DNA of Agrobacterium tumefaciens. After generating the insertion 
mutants, the mutants can be screened to identify those containing the insertion in a transcription 
factor or transcription factor homologue gene. Plants containing a single transgene insertion 
event at the desired gene can be crossed to generate homozygous plants for the mutation (Koncz 
et al. (1992) Methods in Arabidopsis Research. World Scientific). 

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by homologous 
recombination (Kempin et al. (1997) Nature 389:802). 

A plant trait can also be modified by using the cre-lox system (for example, as
described in US Pat. No. 5,658,772). A plant genome can be modified to include first and 
second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same 
orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in 
the opposite orientation, the intervening sequence is inverted. 
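
The orientation-dependent outcomes just described can be sketched as a toy model in which a genome is a list of named elements and each lox site carries an orientation marker. The token names ("lox>", "lox<") and the element labels are hypothetical, and the model deliberately ignores the strand reversal of each inverted element.

```python
LOX_SITES = ("lox>", "lox<")


def recombine(genome: list[str]) -> list[str]:
    """Apply Cre-mediated recombination to a genome holding exactly two lox sites."""
    positions = [i for i, token in enumerate(genome) if token in LOX_SITES]
    if len(positions) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = positions
    if genome[first] == genome[second]:
        # Same orientation: the intervening DNA is excised, leaving one lox site.
        return genome[:first + 1] + genome[second + 1:]
    # Opposite orientation: the intervening DNA is inverted in place.
    inverted = genome[first + 1:second][::-1]
    return genome[:first + 1] + inverted + genome[second:]


same = ["promoter", "lox>", "geneA", "geneB", "lox>", "terminator"]
opposite = ["promoter", "lox>", "geneA", "geneB", "lox<", "terminator"]
print(recombine(same))
print(recombine(opposite))
```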

The polynucleotides and polypeptides of this invention can also be expressed in a
plant in the absence of an expression cassette by manipulating the activity or expression level of
the endogenous gene by other means, for example, by ectopically expressing a gene by T-DNA
activation tagging (Ichikawa et al. (1997) Nature 390:698-701; Kakimoto et al. (1996) Science
274:982-985). This method entails transforming a plant with a gene tag containing multiple
transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking
gene coding sequence becomes deregulated. In another example, the transcriptional machinery in
a plant can be modified so as to increase transcription levels of a polynucleotide of the invention
(see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of
the DNA binding specificity of zinc finger proteins by changing particular amino acids in the
DNA binding motif).

The transgenic plant can also include the machinery necessary for expressing or 
altering the activity of a polypeptide encoded by an endogenous gene, for example by altering the 
phosphorylation state of the polypeptide to maintain it in an activated state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating 
the polynucleotides of the invention and/or expressing the polypeptides of the invention can be 
produced by a variety of well established techniques as described above. Following construction 
of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a 
transcription factor or transcription factor homologue, of the invention, standard techniques can 
be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic 
plant.

The plant can be any higher plant, including gymnosperms, monocotyledonous
and dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean,
clover, etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed,
broccoli, etc.), Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley,
millet, etc.), Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See
protocols described in Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species,
Macmillan Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990)
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434.

Transformation and regeneration of both monocotyledonous and dicotyledonous
plant cells is now routine, and the selection of the most appropriate transformation technique will
be determined by the practitioner. The choice of method will vary with the type of plant to be
transformed; those skilled in the art will recognize the suitability of particular methods for given
plant types. Suitable methods can include, but are not limited to: electroporation of plant
protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated
transformation; transformation using viruses; micro-injection of plant cells; micro-projectile
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated
transformation. Transformation means introducing a nucleotide sequence in a plant in a manner to
cause stable or transient expression of the sequence.

Successful examples of the modification of plant characteristics by
transformation with cloned sequences which serve to illustrate the current knowledge in this field
of technology, and which are herein incorporated by reference, include: U.S. Patent Nos.
5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526;
5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042.

Following transformation, plants are preferably selected using a dominant
selectable marker incorporated into the transformation vector. Typically, such a marker will
confer antibiotic or herbicide resistance on the transformed plants, and selection of transformants
can be accomplished by exposing the plants to appropriate concentrations of the antibiotic or
herbicide.

After transformed plants are selected and grown to maturity, those plants
showing a modified trait are identified. The modified trait can be any of those traits described
above. Additionally, whether the modified trait is due to changes in expression levels or
activity of the polypeptide or polynucleotide of the invention can be determined by analyzing
mRNA expression using Northern blots, RT-PCR or microarrays, or protein expression using
immunoblots or Western blots or gel shift assays.

INTEGRATED SYSTEMS— SEQUENCE IDENTITY 

Additionally, the present invention may be an integrated system, computer or
computer readable medium that comprises an instruction set for determining the identity of one or
more sequences in a database. In addition, the instruction set can be used to generate or identify
sequences that meet any specified criteria. Furthermore, the instruction set may be used to
associate or link certain functional benefits, such as improved seed characteristics, with one or more
identified sequences.

For example, the instruction set can include, e.g., a sequence comparison or other
alignment program, e.g., an available program such as, for example, the Wisconsin Package
Version 10.0, such as BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison,
WI). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR or private
sequence databases such as PhytoSeq (Incyte Pharmaceuticals, Palo Alto, CA) can be searched.

Alignment of sequences for comparison can be conducted by the local homology
algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment
algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity
method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. U.S.A. 85:2444, or by computerized
implementations of these algorithms. After alignment, sequence comparisons between two (or
more) polynucleotides or polypeptides are typically performed by comparing the
sequences over a comparison window to identify and compare local regions of sequence
similarity. The comparison window can be a segment of at least about 20 contiguous positions,
usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A
description of the method is provided in Ausubel et al., supra.
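
For orientation, a minimal sketch of the local alignment recurrence of Smith and Waterman is given below. It returns only the best local alignment score, uses a linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions for the example rather than parameters taken from any program named herein.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # scores[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    scores = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            scores[i][j] = max(
                0,                                     # start a new local alignment
                scores[i - 1][j - 1] + substitution,   # align the two residues
                scores[i - 1][j] + gap,                # gap in b
                scores[i][j - 1] + gap,                # gap in a
            )
            best = max(best, scores[i][j])
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Resetting negative running scores to zero is what makes the alignment local: poorly matching flanking regions do not penalize a well-matching core.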

A variety of methods of determining sequence relationships can be used,
including manual alignment and computer assisted sequence alignment and analysis. This latter
approach is a preferred approach in the present invention, due to the increased throughput
afforded by computer assisted methods. As noted above, a variety of computer programs for
performing sequence alignment are available, or can be produced by one of skill.

One example algorithm that is suitable for determining percent sequence identity
and sequence similarity is the BLAST algorithm, which is described in Altschul et al. J. Mol. Biol.
215:403-410 (1990). Software for performing BLAST analyses is publicly available, e.g.,
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them.
The word hits are then extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >
0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score. Extension of the word hits in each
direction is halted when: the cumulative alignment score falls off by the quantity X from its
maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of
one or more negative-scoring residue alignments; or the end of either sequence is reached. The
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino
acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E)
of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad.
Sci. USA 89:10915).
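
The seeding and extension steps can be illustrated with the simplified sketch below. It is not the BLAST program: it seeds only on exact word matches (the neighborhood threshold T is omitted), performs ungapped extension, and computes no statistics. The value X=20, the test sequences and the function names are assumptions for the example; M=5 and N=-4 follow the BLASTN defaults stated above.

```python
def extend(query, subject, qi, si, step, start_score, M=5, N=-4, X=20):
    """Extend one direction from (qi, si); return (score gain, residues added)."""
    score = best = start_score
    best_len = length = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += M if query[qi] == subject[si] else N
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= X or score <= 0:
            break  # fell off by X from the maximum, or dropped to zero or below
        qi += step
        si += step
    return best - start_score, best_len


def seed_and_extend(query, subject, W=11, M=5, N=-4, X=20):
    """Return the best (score, query start, subject start, length) or None."""
    words = {}
    for i in range(len(query) - W + 1):
        words.setdefault(query[i:i + W], []).append(i)
    hsps = []
    for s in range(len(subject) - W + 1):
        for q in words.get(subject[s:s + W], []):
            seed = W * M
            right, rlen = extend(query, subject, q + W, s + W, 1, seed, M, N, X)
            left, llen = extend(query, subject, q - 1, s - 1, -1, seed + right, M, N, X)
            hsps.append((seed + right + left, q - llen, s - llen, W + llen + rlen))
    return max(hsps, default=None)


# Hypothetical sequences sharing the eight-base core ACGTACGG.
query = "TTTTACGTACGGTTTT"
subject = "CCCCACGTACGGCCCC"
print(seed_and_extend(query, subject, W=4))
```

A shorter word length finds more seeds at greater computational cost, which is the sensitivity and speed trade-off attributed to W above.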

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & 
Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided
by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of 
the probability by which a match between two nucleotide or amino acid sequences would occur 
by chance. For example, a nucleic acid is considered similar to a reference sequence (and, 
therefore, in this context, homologous) if the smallest sum probability in a comparison of the test 
nucleic acid to the reference nucleic acid is less than about 0.1, or less than about 0.01, or
even less than about 0.001. An additional example of a useful sequence alignment algorithm is 
PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using 
progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a 
maximum length of 5,000 letters. 

The integrated system, or computer, typically includes a user input interface
allowing a user to selectively view one or more sequence records corresponding to the one or
more character strings, as well as an instruction set which aligns the one or more character strings
with each other or with an additional character string to identify one or more regions of sequence
similarity. The system may include a link of one or more character strings with a particular 
phenotype or gene function. Typically, the system includes a user readable output element which 
displays an alignment produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented on a single
computer comprising multiple processors or on a multiplicity of computers. The computers can 
be linked, e.g. through a common bus, but more preferably the computer(s) are nodes on a 
network. The network can be a generalized or a dedicated local or wide-area network and, in 
certain preferred embodiments, the computers may be components of an intra-net or an internet. 

Thus, the invention provides methods for identifying a sequence similar or 
homologous to one or more polynucleotides as noted herein, or one or more target polypeptides 
encoded by the polynucleotides, or otherwise noted herein and may include linking or associating 
a given plant phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the sequence
database using the relevant sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database and, if done before the querying 
step, for insertion of control sequences into the database. The control sequences can be detected 
by the query to ensure the general integrity of both the database and the query. As noted, the 
query can be performed using a web browser based interface. For example, the database can be a 
centralized public database such as those noted herein, and the querying can be done from a 
remote terminal or computer across an internet or intranet. 
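
The use of control sequences to check a query can be sketched as follows. The in-memory dictionary standing in for a sequence database, the record names, the sequences and the shared-word matching rule are all hypothetical simplifications; a real system would query a database with an alignment program such as those noted above.

```python
def shares_word(a: str, b: str, k: int) -> bool:
    """Return True if sequences a and b share at least one word of length k."""
    words = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in words for i in range(len(b) - k + 1))


def query_database(database: dict[str, str], query: str, k: int = 8) -> list[str]:
    """Return the sorted names of records that share a word of length k with the query."""
    return sorted(name for name, seq in database.items() if shares_word(query, seq, k))


# Hypothetical records used only for illustration.
database = {
    "record_1": "GGGGCCCCGGGGCCCCGGGG",
    "record_2": "TTTTATGGCTGAAAAACGTTTT",
}
query = "ATGGCTGAAAAACGT"

# Enter a control sequence before querying; it must be recovered by the query.
database["control"] = query
hits = query_database(database, query)
if "control" not in hits:
    raise RuntimeError("control sequence was not recovered; check database and query")
print(hits)
```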

EXAMPLES 

The following examples are intended to illustrate but not limit the present
invention.

EXAMPLE I. FULL LENGTH GENE IDENTIFICATION AND CLONING 

Putative transcription factor sequences (genomic or ESTs) related to known
transcription factors were identified in the Arabidopsis thaliana GenBank database using the
tblastn sequence analysis program using default parameters and a P-value cutoff threshold of -4
or -5 or lower, depending on the length of the query sequence. Putative transcription factor
sequence hits were then screened to identify those containing particular sequence strings. If the 
sequence hits contained such sequence strings, the sequences were confirmed as transcription 
factors. 
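The two-step screen described above (a P-value cutoff followed by a check for particular sequence strings) might be sketched as follows. The sketch assumes that the cutoff of -4 or -5 refers to the base-10 exponent of the P-value; the hit names, peptide sequences, and the sequence string are hypothetical and are not taken from the search described herein.

```python
import math


def passes_cutoff(p_value: float, exponent_cutoff: int = -4) -> bool:
    """True when log10(P) is at or below the cutoff exponent (assumed meaning of '-4')."""
    if p_value <= 0:
        return True  # a reported P-value of zero is the strongest possible hit
    return math.log10(p_value) <= exponent_cutoff


def screen_hits(hits, sequence_strings, exponent_cutoff=-4):
    """Keep hits that pass the cutoff and contain at least one required sequence string."""
    kept = []
    for name, p_value, sequence in hits:
        has_string = any(s in sequence for s in sequence_strings)
        if passes_cutoff(p_value, exponent_cutoff) and has_string:
            kept.append(name)
    return kept


# Hypothetical hits: (name, P-value, translated sequence).
hits = [
    ("hit_1", 3.6e-40, "MGRSPCCEKAHTNKGAWTKEED"),  # strong hit, contains the string
    ("hit_2", 2.0e-2, "MGRSPCCEKAHTNKGAWTKEED"),   # contains the string, weak P-value
    ("hit_3", 1.0e-12, "MSTLLKPPQQRRSSTTVVWWYY"),  # strong hit, lacks the string
]
sequence_strings = ["GAWTKEED"]  # hypothetical diagnostic string

confirmed = screen_hits(hits, sequence_strings)
print(confirmed)
```

A stricter cutoff (for example, -5 for shorter query sequences) can be applied by passing a different exponent_cutoff.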

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues 
or treatments, or genomic libraries, were screened to identify novel members of a transcription 
factor family using a low stringency hybridization approach. Probes were synthesized using 
gene-specific primers in a standard PCR reaction (annealing temperature 60° C) and labeled with 32P 
dCTP using the High Prime DNA Labeling Kit (Boehringer Mannheim). Purified radiolabelled 
probes were added to filters immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 
7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60° C with shaking. Filters 
were washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C. 

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA 
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the Marathon™ 
cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the method entailed first isolating 
poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded 
cDNA, blunting cDNA ends, followed by ligation of the Marathon™ Adaptor to the cDNA to 
form a library of adaptor-ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor-specific 
primers for both 5' and 3' RACE reactions. Nested primers, rather than single primers, were used 
to increase PCR specificity. Using 5' and 3' RACE reactions, 5' and 3' RACE fragments were 
obtained, sequenced and cloned. The process was repeated until the 5' and 3' ends of the 
full-length gene were identified. The full-length cDNA was then generated by end-to-end PCR 
using primers specific to the 5' and 3' ends of the gene. 

EXAMPLE II. CONSTRUCTION OF EXPRESSION VECTORS 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression vector was 
pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et al. (1987) Nucleic 
Acids Research 15:1543-58) and contain the CaMV 35S promoter to express transgenes. To 
clone the sequence into the vector, both pMEN20 and the amplified DNA fragment were digested 
separately with SalI and NotI restriction enzymes at 37° C for 2 hours. The digestion products 
were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide 
staining. The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, CA). The fragments of interest were 
ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England 
Biolabs, MA) were carried out at 16° C for 16 hours. The ligated DNAs were transformed into 
competent cells of the E. coli strain DH5alpha by using the heat shock method. The 
transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma). 

Individual colonies were grown overnight in five milliliters of LB broth 
containing 50 mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini 
Prep kits (Qiagen, CA). 

EXAMPLE III. TRANSFORMATION OF AGROBACTERIUM WITH THE EXPRESSION 
VECTOR 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The stock of 
Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. 
(1990) FEMS Microbiol Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB 
medium (Sigma) overnight at 28° C with shaking until an absorbance (A600) of 0.5 - 1.0 was 
reached. Cells were harvested by centrifugation at 4,000 x g for 15 min at 4° C. Cells were then 
resuspended in 250 µl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were 
centrifuged again as described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described above at a 
volume of 100 µl and 750 µl, respectively. Resuspended cells were then distributed into 40 µl 
aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50 - 100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 
8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture was then transferred to 
a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 and 
200 µF using a Gene Pulser II apparatus (Bio-Rad). After electroporation, cells were 
immediately resuspended in 1.0 ml LB and allowed to recover without antibiotic selection for 2 - 
4 hours at 28° C in a shaking incubator. After recovery, cells were plated onto selective medium 
of LB broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28° C. 
Single colonies were then picked and inoculated in fresh medium. The presence of the plasmid 
construct was verified by PCR amplification and sequence analysis. 

EXAMPLE IV. TRANSFORMATION OF ARABIDOPSIS PLANTS WITH AGROBACTERIUM 
TUMEFACIENS WITH EXPRESSION VECTOR 

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and used to 
transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l 
kanamycin were inoculated with the colonies and grown at 28° C with shaking for 2 days until an 
absorbance (A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g 
for 10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts (Sigma), 1 
X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 µM benzylamino purine 
(Sigma), 200 µl/L Silwet L-77 (Lehle Seeds)) until an absorbance (A600) of 0.8 was reached. 

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium (Hummert 
International) covered with fiberglass mesh (18 mm X 16 mm). Plants were grown under 
continuous illumination (50-75 µE/m2/sec) at 22-23° C with 65-70% relative humidity. After 
about 4 weeks, primary inflorescence stems (bolts) were cut off to encourage growth of multiple 
secondary bolts. After flowering of the mature secondary bolts, plants were prepared for 
transformation by removal of all siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow draining into 
a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and 
pots were turned upright. The immersion procedure was repeated one week later, for a total of two 
immersions per pot. Seeds were then collected from each transformation pot and analyzed 
following the protocol described below. 

EXAMPLE V. IDENTIFICATION OF ARABIDOPSIS PRIMARY TRANSFORMANTS 

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and 
sterile H2O and washed by shaking the suspension for 20 min. The wash solution was then 
drained and replaced with fresh wash solution to wash the seeds for 20 min with shaking. After 
removal of the second wash solution, a solution containing 0.1% (v/v) Triton X-100 and 70% 
ethanol (Equistar) was added to the seeds and the suspension was shaken for 5 min. After 
removal of the ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 min. After 
removal of the bleach/detergent solution, seeds were then washed five times in sterile distilled 
H2O. The seeds were stored in the last wash water at 4° C for 2 days in the dark before being 
plated onto antibiotic selection medium (1 X Murashige and Skoog salts (pH adjusted to 5.7 with 
1M KOH), 1 X Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l 
kanamycin). Seeds were germinated under continuous illumination (50-75 µE/m2/sec) at 22-23° 
C. After 7-10 days of growth under these conditions, kanamycin resistant primary transformants 
(T1 generation) were visible and obtained. These seedlings were transferred first to fresh 
selection plates where the seedlings continued to grow for 3-5 more days, and then to soil 
(Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected; kanamycin 
resistant seedlings were selected and analyzed. The expression levels of the recombinant 
polynucleotides in the transformants vary from about a 5% expression level increase to at least a 
100% expression level increase. Similar observations are made with respect to polypeptide level 
expression. 

EXAMPLE VI. IDENTIFICATION OF ARABIDOPSIS PLANTS WITH TRANSCRIPTION 
FACTOR GENE KNOCKOUTS 

The screening of insertion mutagenized Arabidopsis collections for null mutants 
in a known target gene was essentially as described in Krysan et al. (1999) Plant Cell 11:2283-2290. 
Briefly, gene-specific primers, nested by 5-250 bp to each other, were designed from the 
5' and 3' regions of a known target gene. Similarly, nested sets of primers were also created 
specific to each of the T-DNA or transposon ends (the "right" and "left" borders). All possible 
combinations of gene-specific and T-DNA/transposon primers were used to detect by PCR an 
insertion event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allowed the precise determination of the T-DNA/transposon insertion point 
relative to the target gene. Insertion events within the coding or intervening sequence of the 
genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique 
mutant plant for functional characterization. The method is described in more detail in Yu and 
Adam, US Application Serial No. 09/177,733 filed October 23, 1998. 
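The phrase "all possible combinations" of gene-specific and T-DNA/transposon primers can be made concrete with a short enumeration. The primer names below are hypothetical placeholders (two nested primers at each end of the gene and at each border); the actual number of primers used is not stated above.

```python
from itertools import product

# Hypothetical nested gene-specific primers from the 5' and 3' regions of the target gene.
gene_primers = [
    "gene_5prime_outer",
    "gene_5prime_nested",
    "gene_3prime_outer",
    "gene_3prime_nested",
]

# Hypothetical nested primers specific to the left and right borders of the insertion.
insertion_primers = [
    "LB_outer",
    "LB_nested",
    "RB_outer",
    "RB_nested",
]

# Every gene-specific primer is paired with every border primer for a PCR reaction.
primer_pairs = list(product(gene_primers, insertion_primers))

for gene_primer, border_primer in primer_pairs:
    print(gene_primer, "+", border_primer)
```

With four primers of each kind, sixteen reactions cover every orientation in which an insertion could sit relative to the target gene.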

EXAMPLE VII. IDENTIFICATION OF SEED CHARACTERISTICS PHENOTYPE IN 
OVEREXPRESSOR OR GENE KNOCKOUT PLANTS 

Experiments were performed to identify those transformants or knockouts that exhibited 
improved seed characteristics. For such studies, the transformants were observed by eye or 
biochemical assays were performed. 

Among the biochemicals that were assayed were insoluble sugars, such as arabinose, 
fucose, galactose, mannose, rhamnose or xylose or the like; prenyl lipids, such as lutein, 
beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or 
gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic acid), 18:0 
(stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 (eicosenoic 
acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by altering the levels of C29, C31, or 
C33 alkanes; sterols, such as brassicasterol, campesterol, stigmasterol, sitosterol or stigmastanol 
or the like; glucosinolates; and protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue was from 
leaves or seeds. For leaves, lipids were extracted and esterified with hot methanolic H2SO4 and 
partitioned into hexane from methanolic brine. For seed fatty acids, seeds were pulverized and 
extracted in methanol:heptane:toluene:2,2-dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 
minutes at 80° C. After cooling to room temperature the upper phase, containing the seed fatty 
acid esters, was subjected to GC analysis. Fatty acid esters from both seed and leaf tissues were 
analyzed with a Supelco SP-2330 column. 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 95° C for 
10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95° C for a further 
10 minutes, the extraction solvent was applied to a DEAE Sephadex column which had been previously 
equilibrated with 0.5 M pyridine acetate. Desulfoglucosinolates were eluted with 300 ul water 
and analyzed by reverse phase HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using an identical method as fatty acids and 
extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. Samples were 
chromatographed on a J&W DB35 mass spectrometer (J&W Scientific). 

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% pyrogallol 
as an antioxidant. For seeds, extracted samples were filtered and a portion removed for 
tocopherol and carotenoid/chlorophyll analysis by HPLC. The remaining material was saponified 
for sterol determination. For leaves, an aliquot was removed and diluted with methanol and 
chlorophyll A, chlorophyll B, and total carotenoids were measured by spectrophotometry by 
determining absorbance at 665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for 
tocopherol and carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 
column (4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% KOH 
at 80° C for one hour. The samples were cooled and diluted with a mixture of methanol and 
water. A solution of 2% methylene chloride in hexane was mixed in and the samples were 
centrifuged. The aqueous methanol phase was again re-extracted with 2% methylene chloride in 
hexane and, after centrifugation, the two upper phases were combined and evaporated. 2% 
methylene chloride in hexane was added to the tubes and the samples were then extracted with 
one ml of water. The upper phase was removed, dried, and resuspended in 400 ul of 2% 
methylene chloride in hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 
mm ID, 0.25 um phase, J&W Scientific). 

Insoluble sugar levels were measured by the method essentially described by Reiter et al., 
Plant Journal 12:335-345. This method analyzes the neutral sugar composition of cell wall 
polymers found in Arabidopsis leaves. Soluble sugars were separated from sugar polymers by 
extracting leaves with hot 70% ethanol. The remaining residue containing the insoluble 
polysaccharides was then acid hydrolyzed with allose added as an internal standard. Sugar 
monomers generated by the hydrolysis were then reduced to the corresponding alditols by 
treatment with NaBH4, then were acetylated to generate the volatile alditol acetates which were 
then analyzed by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention times of peaks 
from wild-type plant extracts. Alditol acetates were analyzed on a Supelco SP-2330 capillary 
column (30 m x 250 um x 0.2 um) using a temperature program beginning at 180° C for 2 
minutes followed by an increase to 220° C in 4 minutes. After holding at 220° C for 10 minutes, 
the oven temperature was increased to 240° C in 2 minutes and held at this temperature for 10 
minutes and brought back to room temperature. 
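As a worked check of the oven temperature program described above, the ramp rates and the programmed run time can be computed directly from the stated temperatures and durations. The step labels are illustrative, and the final return to room temperature is left out because its duration is not given.

```python
# Each step: (kind, start temperature in C, end temperature in C, duration in minutes).
program = [
    ("hold", 180, 180, 2),
    ("ramp", 180, 220, 4),
    ("hold", 220, 220, 10),
    ("ramp", 220, 240, 2),
    ("hold", 240, 240, 10),
]

# Programmed run time, excluding the cool-down to room temperature.
total_minutes = sum(minutes for _, _, _, minutes in program)

# Ramp rate in degrees C per minute for each heating step.
ramp_rates = [
    (end - start) / minutes
    for kind, start, end, minutes in program
    if kind == "ramp"
]

print(total_minutes, ramp_rates)
```

Both heating steps work out to the same rate, and the programmed portion of each run lasts 28 minutes.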

To identify plants with alterations in total seed oil or protein content, 150 mg of seeds 
from T2 progeny plants were subjected to analysis by Near Infrared Reflectance (NIR) using a 
Foss NirSystems Model 6500 with a spinning cup transport system. 

Table 3 shows the phenotypes observed for particular overexpressor or knockout 
plants and provides the SEQ ID No., the internal reference code (GID), whether a knockout or 
overexpressor plant was analyzed and the observed phenotype. 



Table 3 

GID     Knockout (KO) or     Phenotype observed 
        overexpressor (OE) 
G214    OE                   Up to 111% increase in seed lutein 
G226    OE                   Up to 17% increase in seed protein content 
G229    OE                   Up to 11% increase in seed oil, 13% decrease in seed protein 
G241    OE                   Up to 13% decrease in seed oil 
G464    OE                   Up to 12% decrease in seed oil, 25% increase in seed protein 
G663    OE                   Up to 16% decrease in seed oil, 14% increase in seed protein 
G776    OE                   Up to 31% alteration in some seed fatty acids, including 
G778    OE                   Up to 32% increase in seed 18:1 fatty acid 
G865    OE                   Up to 39% increase in seed protein; 23% increase in seed oil 
G869    OE                   Up to 25% alteration in some seed fatty acids 
G883    OE                   Up to 47% decrease in seed lutein 
G938    OE                   Up to 115% increase in some seed fatty acids 
G1328   OE                   Up to 43% decrease in seed lutein 
G584    OE                   Larger seeds 
G668    OE                   Reduced seed color 

For a particular overexpressor that shows a less beneficial seed characteristic, it may be 
more useful to select a plant with a decreased expression of the particular transcription factor. For 
a particular knockout that shows a less beneficial seed characteristic, it may be more useful to 
select a plant with an increased expression of the particular transcription factor. 

EXAMPLE VIII. IDENTIFICATION OF HOMOLOGOUS SEQUENCES 

Homologous sequences from Arabidopsis and plant species other than Arabidopsis were 
identified using database sequence search tools, such as the Basic Local Alignment Search Tool 
(BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucl. Acid 
Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the 
BLOSUM-62 scoring matrix (Henikoff, S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 
89: 10915-10919). 

Identified Arabidopsis homologous sequences are provided in Figure 2 and included in 
the Sequence Listing. The percent sequence identity among these sequences is as low as 47%. 
Additionally, the entire NCBI GenBank database was filtered for sequences 
from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank 
database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding 
entries associated with taxonomic ID 3701 (Arabidopsis thaliana). These sequences were 
compared to sequences representing genes of SEQ ID Nos. 1-54 on 9/26/2000 using the 
Washington University TBLASTX algorithm (version 2.0a19MP). For each gene of SEQ ID 
Nos. 1-54, individual comparisons were ordered by probability score (P-value), where the score 
reflects the probability that a particular alignment occurred by chance. For example, a score of 
3.6e-40 is 3.6 x 10^-40. For up to ten species, the gene with the lowest P-value (and therefore the 
most likely homolog) is listed in Figure 3. 
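The ordering step described above (lowest P-value per species, for up to ten species) can be sketched as follows. The first three hits are taken from the G214 rows of Figure 3A; the fourth hit is a hypothetical weaker Glycine max hit added only to show that a single best hit per species is kept. The function name and tuple layout are illustrative assumptions.

```python
def best_hit_per_species(hits, max_species=10):
    """Keep the lowest P-value hit for each species, then rank species by that P-value."""
    best = {}
    for nid, p_value, species in hits:
        if species not in best or p_value < best[species][1]:
            best[species] = (nid, p_value)
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(species, nid, p_value) for species, (nid, p_value) in ranked[:max_species]]


# (GenBank NID, P-value, species)
hits = [
    ("8170933", 8.80e-35, "Lycopersicon esculentum"),
    ("9205339", 1.20e-27, "Glycine max"),
    ("8577344", 1.80e-23, "Zea mays"),
    ("hypothetical_1", 5.0e-06, "Glycine max"),  # invented weaker hit, not in Figure 3A
]

top = best_hit_per_species(hits)
for species, nid, p_value in top:
    print(f"{species}\t{nid}\t{p_value:.2e}")
```

Because a lower P-value means the alignment is less likely to have occurred by chance, sorting in ascending order places the most likely homolog first.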

In addition to P-values, comparisons were also scored by percentage identity. Percentage 
identity reflects the degree to which two segments of DNA or protein are identical over a 
particular length. The ranges of percent identity between the non-Arabidopsis genes shown in 
Figure 3 and the Arabidopsis genes in the sequence listing are: SEQ ID No. 1: 38%-89%; SEQ ID 
No. 3: 50%-69%; SEQ ID No. 5: 68%-93%; SEQ ID No. 7: 69%-84%; SEQ ID No. 9: 34%-60%; 
SEQ ID No. 11: 52%-81%; SEQ ID No. 13: 48%-81%; SEQ ID No. 15: 37%-80%; SEQ ID No. 
17: 48%-83%; SEQ ID No. 19: 31%-68%; SEQ ID No. 21: 47%-90%; SEQ ID No. 23: 57%-88%; 
SEQ ID No. 25: 39%-79%; SEQ ID No. 27: 35%-84%; SEQ ID No. 29: 54%-89%; SEQ ID 
No. 31: 42%-88%; SEQ ID No. 33: 41%-75%; SEQ ID No. 35: 34%-67%; SEQ ID No. 37: 72%-86%; 
SEQ ID No. 39: 39%-84%; SEQ ID No. 41: 40%-58%; SEQ ID No. 43: 44%-82%; SEQ ID 
No. 45: 54%-68%; SEQ ID No. 47: 48%-64%; SEQ ID No. 49: 46%-88%; SEQ ID No. 51: 52%-92%; 
and SEQ ID No. 53: 48%-80%. 
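Percentage identity as defined above can be illustrated with a minimal calculation over two aligned segments of equal length. The two peptide segments are invented for illustration, and treating a gap character as a non-match is an assumption of this sketch rather than a statement of how the search program computes its figure.

```python
def percent_identity(segment_a: str, segment_b: str) -> float:
    """Percentage of aligned positions at which the two segments carry the same residue."""
    if len(segment_a) != len(segment_b):
        raise ValueError("aligned segments must have the same length")
    if not segment_a:
        raise ValueError("segments must not be empty")
    matches = sum(
        1 for a, b in zip(segment_a, segment_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(segment_a)


# Hypothetical aligned segments differing at two of ten positions.
identity = percent_identity("MKTAYIAKQR", "MKTAHIAKQK")
print(identity)
```

Eight of the ten aligned positions match, so the two segments are 80% identical over that length.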

The polynucleotides and polypeptides in the Sequence Listing and the identified 
homologous sequences may be stored in a computer system and have associated or linked with 
the sequences a function, such as that the polynucleotides and polypeptides are useful for 
modifying the seed characteristics of a plant. 
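One way such a link between a stored sequence and a function might be represented is sketched below. The record layout and field names are hypothetical; the two entries use the SEQ ID Nos., GIDs and phenotypes reported for G584 and G668 in Figure 1 and Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    """A stored sequence identifier linked to an associated function or phenotype."""
    seq_id: int
    gid: str
    molecule: str
    function: str


records = [
    SequenceRecord(27, "G584", "cDNA", "larger seeds when overexpressed"),
    SequenceRecord(29, "G668", "cDNA", "reduced seed color when overexpressed"),
]

# Index the records so that a function can be retrieved from a gene identifier.
by_gid = {record.gid: record for record in records}

print(by_gid["G584"].function)
```

A query that returns a sequence identifier can then be followed to the associated seed characteristic through the same index.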

All references, publications, patents and other documents herein are incorporated by 
reference in their entirety for all purposes. Although the invention has been described with 
reference to the embodiments and examples above, it should be understood that various 
modifications can be made without departing from the spirit of the invention. 

What is claimed is: 

1. A transgenic plant with modified seed characteristics, which plant comprises a 
recombinant polynucleotide comprising a nucleotide sequence selected from the group consisting 
of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-27, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-27, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's seed 
characteristics; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; and 

(l) a nucleotide sequence which encodes a polypeptide having at least 65% sequence 
identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, where N=1-27. 

2. The transgenic plant of claim 1, further comprising a constitutive, inducible, or tissue- 
active promoter operably linked to said nucleotide sequence. 

3. The transgenic plant of claim 1, wherein the plant is selected from the group consisting 
of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, and 
vegetable brassicas. 

4. An isolated or recombinant polynucleotide comprising a nucleotide sequence selected 
from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-27, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-27, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's seed 
characteristics; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; and 

(l) a nucleotide sequence which encodes a conserved domain of a polypeptide having at 
least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, 
where N=1-27. 






5. The isolated or recombinant polynucleotide of claim 4, further comprising a constitutive, 
inducible, or tissue-active promoter operably linked to the nucleotide sequence. 

6. A cloning or expression vector comprising the isolated or recombinant polynucleotide of 
claim 4. 

7. A cell comprising the cloning or expression vector of claim 6. 

8. A transgenic plant comprising the isolated or recombinant polynucleotide of claim 4. 

9. A composition produced by one or more of: 

(a) incubating one or more polynucleotide of claim 4 with a nuclease; 

(b) incubating one or more polynucleotide of claim 4 with a restriction enzyme; 

(c) incubating one or more polynucleotide of claim 4 with a polymerase; 

(d) incubating one or more polynucleotide of claim 4 with a polymerase and a primer; 

(e) incubating one or more polynucleotide of claim 4 with a cloning vector, or 

(f) incubating one or more polynucleotide of claim 4 with a cell. 

10. A composition comprising two or more different polynucleotides of claim 4. 

11. An isolated or recombinant polypeptide comprising a subsequence of at least about 15 
contiguous amino acids encoded by the recombinant or isolated polynucleotide of claim 4. 

12. A plant ectopically expressing an isolated polypeptide of claim 11. 

13. A method for producing a plant having modified seed characteristics, the method 
comprising altering the expression of the isolated or recombinant polynucleotide of claim 4 or the 
expression levels or activity of a polypeptide of claim 11 in a plant, thereby producing a modified 
plant, and selecting the modified plant for improved seed characteristics thereby providing the 
modified plant with modified seed characteristics. 

14. The method of claim 13, wherein the polynucleotide is a polynucleotide of claim 4. 






15. A method of identifying a factor that is modulated by or interacts with a polypeptide 
encoded by a polynucleotide of claim 4, the method comprising: 

(a) expressing a polypeptide encoded by the polynucleotide in a plant; and 

(b) identifying at least one factor that is modulated by or interacts with the polypeptide. 

16. The method of claim 15, wherein the identifying is performed by detecting binding by the 
polypeptide to a promoter sequence, or detecting interactions between an additional protein and 
the polypeptide in a yeast two hybrid system. 

17. The method of claim 15, wherein the identifying is performed by detecting expression of 
a factor by hybridization to a microarray, subtractive hybridization or differential display. 

18. A method of identifying a molecule that modulates activity or expression of a 
polynucleotide or polypeptide of interest, the method comprising: 

(a) placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of claim 4; and, 

(b) monitoring one or more of: 

(i) expression level of the polynucleotide in the plant; 

(ii) expression level of the polypeptide in the plant; 

(iii) modulation of an activity of the polypeptide in the plant; or 

(iv) modulation of an activity of the polynucleotide in the plant. 

19. An integrated system, computer or computer readable medium comprising one or more 
character strings corresponding to a polynucleotide of claim 4, or to a polypeptide encoded by the 
polynucleotide. 

20. The integrated system, computer or computer readable medium of claim 19, further 
comprising a link between said one or more sequence strings to a modified plant seed 
characteristics phenotype. 

21. A method of identifying a sequence similar or homologous to one or more 
polynucleotides of claim 4, or one or more polypeptides encoded by the polynucleotides, the 
method comprising: 

(a) providing a sequence database; and, 

(b) querying the sequence database with one or more target sequences corresponding to 
the one or more polynucleotides or to the one or more polypeptides to identify one or 
more sequence members of the database that display sequence similarity or homology to 
one or more of the one or more target sequences. 

22. The method of claim 21, wherein the querying comprises aligning one or more of the 
target sequences with one or more of the one or more sequence members in the sequence 
database. 

23. The method of claim 21, wherein the querying comprises identifying one or more of the 
one or more sequence members of the database that meet a user-selected identity criteria with one 
or more of the target sequences. 

24. The method of claim 21, further comprising linking the one or more of the 
polynucleotides of claim 4, or encoded polypeptides, to a modified plant seed characteristics 
phenotype. 

25. A plant comprising altered expression levels of an isolated or recombinant polynucleotide 
of claim 4. 

26. A plant comprising altered expression levels or the activity of an isolated or recombinant 
polypeptide of claim 11. 

27. A plant lacking a nucleotide sequence encoding a polypeptide of claim 11. 

Figure 1 

SEQ ID No   GID     cDNA or protein   Conserved domain 
1           G214    cDNA 
2           G214    protein           [illegible] 
3           G226    cDNA 
4           G226    protein           [illegible] 
5           G229    cDNA 
6           G229    protein           [illegible] 
7           G241    cDNA 
8           G241    protein           14-114 
9           G464    cDNA 
10          G464    protein           [illegible] 
11          G663    cDNA 
12          G663    protein           [illegible] 
13          G776    cDNA 
14          G776    protein           [illegible] 
15          G778    cDNA 
16          G778    protein           [illegible] 
17          G865    cDNA 
18          G865    protein           [illegible] 
19          G869    cDNA 
20          G869    protein           [illegible] 
21          G883    cDNA 
22          G883    protein           [illegible] 
23          G938    cDNA 
24          G938    protein           96-104 
25          G1328   cDNA 
26          G1328   protein           14-119 
27          G584    cDNA 
28          G584    protein           401-494 
29          G668    cDNA 
30          G668    protein           13-113 

Figure 2 

SEQ ID No   GID     Homolog             cDNA or protein   Conserved domain 
31          G680    homolog of G214     cDNA 
32          G680    homolog of G214     protein 
33          G682    homolog of G226     cDNA 
34          G682    homolog of G226     protein           22-53 
35          G225    homolog of G226     cDNA 
36          G225    homolog of G226     protein           39-76 
37          G678    homolog of G229     cDNA 
38          G678    homolog of G229     protein           14-115 
39          G233    homolog of G241     cDNA 
40          G233    homolog of G241     protein           14-114 
41          G463    homolog of G464     cDNA 
42          G463    homolog of G464     protein           14-23, [remaining ranges illegible] 
43          G2422   homolog of G663     cDNA 
44          G2422   homolog of G663     protein           9-110 
45          G2421   homolog of G663     cDNA 
46          G2421   homolog of G663     protein           9-110 
47          G772    homolog of G776     cDNA 
48          G772    homolog of G776     protein           27-176 
49          G866    homolog of G883     cDNA 
50          G866    homolog of G883     protein           43-300 
51          G941    homolog of G938     cDNA 
52          G941    homolog of G938     protein           95-103 
53          G198    homolog of G1328    cDNA 
54          G198    homolog of G1328    protein           14-117 

Figure 3A

SEQ ID   Gene ID   Genbank NID   P-value     Species
1        G214      8170933       8.80E-35    Lycopersicon esculentum
1        G214      9205339       1.20E-27    Glycine max
1        G214      8577344       1.80E-23    Zea mays
1        G214      9119112       2.40E-18    Medicago truncatula
1        G214      7660673       4.80E-15    Sorghum bicolor
1        G214      8213273       4.40E-14    Oryza sativa
1        G214      3325786       4.70E-10    Gossypium hirsutum
1        G214      9435251       1.50E-09    Hordeum vulgare
1        G214      9411569       6.80E-09    Triticum aestivum
1        G214      7614730       3.00E-07    Lotus japonicus
3        G226      4396287       5.10E-15    Glycine max
3        G226      9410205       1.50E-05    Triticum aestivum
3        G226      3857004       0.11        Populus tremula x Populus tremuloides
3        G226      2428139       0.35        Oryza sativa
5        G229      7337390       5.20E-51    Lycopersicon esculentum
5        G229      7244424       3.90E-50    Mentha x piperita
5        G229      7776053       1.30E-49    Lotus japonicus
5        G229      2921335       4.60E-48    Gossypium hirsutum
5        G229      1491932       3.60E-47    Zea mays
5        G229      6455590       2.20E-44    Glycine max
5        G229      6020191       1.60E-41    Pinus taeda
5        G229      7765706       4.10E-41    Medicago truncatula
5        G229      7629167       3.20E-40    Gossypium arboreum
5        G229      6850206       4.30E-40    Oryza sativa
7        G241      6552360       2.60E-54    Nicotiana tabacum
7        G241      6782745       2.20E-53    Oryza sativa
7        G241      8097368       5.70E-53    Hordeum vulgare
7        G241      20560         1.80E-52    Petunia x hybrida
7        G241      7217727       2.70E-52    Sorghum bicolor
7        G241      5891408       4.60E-52    Lycopersicon esculentum
7        G241      5139803       7.40E-52    Glycine max
7        G241      7560175       4.10E-50    Medicago truncatula
7        G241      8381332       1.40E-44    Gossypium arboreum
7        G241      4886263       1.20E-42    Antirrhinum majus
9        G464      6527230       3.60E-31    Lycopersicon esculentum
9        G464      9305572       1.10E-22    Sorghum bicolor
9        G464      6604917       6.70E-22    Medicago truncatula
9        G464      5058123       2.30E-21    Glycine max
9        G464      3760881       1.20E-19    Oryza sativa
9        G464      5044476       1.20E-17    Gossypium hirsutum
9        G464      9412603       6.40E-15    Triticum aestivum
9        G464      7777277       3.20E-13    Lotus japonicus
9        G464      9410371       1.70E-11    Hordeum vulgare
9        G464      7624108       2.10E-10    Gossypium arboreum
11       G663      7673087       4.10E-43    Petunia integrifolia
11       G663      7673091       2.60E-41    Petunia x hybrida
11       G663      7339148       1.30E-39    Lycopersicon esculentum
11       G663      7673097       1.90E-36    Petunia axillaris
11       G663      5048991       9.90E-34    Gossypium hirsutum
11       G663      6455590       2.00E-31    Glycine max
11       G663      7560175       1.50E-27    Medicago truncatula
11       G663      7244424       3.20E-26    Mentha x piperita
11       G663      6020191       2.90E-25    Pinus taeda

Figure 3B

SEQ ID   Gene ID   Genbank NID   P-value     Species
11       G663      4138298       3.40E-25    Oryza sativa subsp. indica
13       G776      8578423       5.80E-57    Mesembryanthemum crystallinum
13       G776      7411573       2.40E-52    Lycopersicon esculentum
13       G776      9253340       5.80E-43    Solanum tuberosum
13       G776      8383411       6.00E-43    Euphorbia esula
13       G776      7565426       1.50E-39    Medicago truncatula
13       G776      6666629       2.50E-33    Glycine max
13       G776      6732155       3.60E-33    Triticum monococcum
13       G776      7502501       3.00E-32    Gossypium arboreum
13       G776      8708684       3.80E-32    Hordeum vulgare
13       G776      9307772       2.10E-31    Sorghum bicolor
15       G778      9258500       3.10E-36    Glycine max
15       G778      9211293       9.40E-21    Oryza sativa
15       G778      4380303       7.60E-08    Lycopersicon esculentum
15       G778      7718953       4.10E-07    Medicago truncatula
15       G778      7720768       6.80E-07    Lotus japonicus
15       G778      6536575       8.70E-07    Zea mays
15       G778      1668906       0.82        Citrus sinensis
17       G865      9417297       1.70E-32    Triticum aestivum
17       G865      7206394       4.90E-29    Medicago truncatula
17       G865      7796858       5.70E-27    Glycine max
17       G865      4387560       9.20E-25    Lycopersicon esculentum
17       G865      569065        1.50E-23    Oryza sativa
17       G865      7788764       4.10E-23    Lotus japonicus
17       G865      790362        8.40E-22    Nicotiana tabacum
17       G865      7528275       5.90E-21    Mesembryanthemum crystallinum
17       G865      3264766       8.80E-20    Prunus armeniaca
17       G865      8098026       2.00E-19    Hordeum vulgare
19       G869      2213784       1.30E-19    Lycopersicon esculentum
19       G869      3065894       7.30E-19    Nicotiana tabacum
19       G869      8570080       4.20E-18    Oryza sativa
19       G869      7560260       1.50E-17    Medicago truncatula
19       G869      7534890       5.20E-14    Sorghum bicolor
19       G869      6455322       1.10E-13    Glycine max
19       G869      9362061       2.70E-13    Triticum aestivum
19       G869      7788764       5.70E-13    Lotus japonicus
19       G869      7624302       2.50E-12    Gossypium arboreum
19       G869      3858036       2.80E-12    Populus balsamifera subsp. trichocarpa
21       G883      4760595       2.40E-84    Nicotiana tabacum
21       G883      4894962       3.50E-45    Avena sativa
21       G883      6719425       1.70E-36    Glycine max
21       G883      5273248       2.80E-35    Lycopersicon esculentum
21       G883      9302479       3.00E-34    Sorghum bicolor
21       G883      6799932       1.40E-31    Medicago truncatula
21       G883      5456433       4.30E-31    Zea mays
21       G883      8706346       1.40E-30    Hordeum vulgare
21       G883      8404566       2.70E-30    Oryza sativa
21       G883      1432055       2.00E-27    Petroselinum crispum
23       G938      4239844       3.10E-180   Nicotiana tabacum
23       G938      7739794       2.30E-145   Dianthus caryophyllus
23       G938      7567728       9.60E-98    Medicago truncatula
23       G938      8894549       2.[?]0E-93  Cicer arietinum
23       G938      8104209       9.60E-90    Lycopersicon esculentum

SEQ ID   Gene ID   Genbank NID   P-value       Species
23       G938      6462339       4.60E-79      Gossypium hirsutum
23       G938      9204568       1.20E-78      Glycine max
23       G938      7720839       1.10E-69      Lotus japonicus
23       G938      7324903       1.60E-52      Lycopersicon pennellii
23       G938      2427923       [illegible]   Oryza sativa
25       G1328     4383290       [illegible]   Lycopersicon esculentum
25       G1328     1946266       [illegible]   Oryza sativa
25       G1328     9264503       [illegible]   Glycine max
25       G1328     8381332       [illegible]   Gossypium arboreum
25       G1328     9363004       [illegible]   Triticum aestivum
25       G1328     7765706       [illegible]   Medicago truncatula
25       G1328     20562         [illegible]   Petunia x hybrida
25       G1328     5050757       4.10E-46      Gossypium hirsutum
25       G1328     5860031       7.80E-45      Pinus taeda
25       G1328     4886263       5.30E-44      Antirrhinum majus
27       G584      1142618       2.30E-153     Phaseolus vulgaris
27       G584      4321761       2.40E-128     Zea mays
27       G584      9280727       9.70E-122     Oryza sativa
27       G584      6175251       4.80E-78      Lycopersicon esculentum
27       G584      9193975       2.20E-59      Medicago truncatula
27       G584      9364538       1.40E-53      Triticum aestivum
27       G584      6847033       1.[?]0E-49    Glycine max
27       G584      5049283       8.90E-46      Gossypium hirsutum
27       G584      7781217       1.00E-43      Lotus japonicus
27       G584      4519200       1.20E-27      Perilla frutescens
29       G668      8172976       9.70E-73      Medicago truncatula
29       G668      9252441       1.10E-70      Solanum tuberosum
29       G668      5897694       1.90E-66      Lycopersicon esculentum
29       G668      8380712       7.00E-65      Gossypium arboreum
29       G668      7685936       2.20E-58      Glycine max
29       G668      1945280       4.60E-48      Oryza sativa
29       G668      20562         1.10E-40      Petunia x hybrida
29       G668      7217727       8.20E-37      Sorghum bicolor
29       G668      6552360       1.90E-36      Nicotiana tabacum
29       G668      4886263       5.80E-36      Antirrhinum majus
31       G680      9258166       5.70E-36      Glycine max
31       G680      9255178       3.00E-29      Zea mays
31       G680      5274804       1.20E-27      Lycopersicon esculentum
31       G680      4974199       3.00E-22      Oryza sativa
31       G680      3325786       2.10E-21      Gossypium hirsutum
31       G680      9119112       1.30E-18      Medicago truncatula
31       G680      7660673       3.20E-17      Sorghum bicolor
31       G680      7243970       6.10E-16      Mentha x piperita
31       G680      3858093       2.10E-10      Populus balsamifera subsp. trichocarpa
31       G680      8845091       3.70E-10      Triticum aestivum
33       G682      309571        4.40E-08      Zea mays
33       G682      4396287       1.10E-05      Glycine max
33       G682      3857004       0.00051       Populus tremula x Populus tremuloides
33       G682      9410205       0.00085       Triticum aestivum
33       G682      8382118       0.0079        Gossypium arboreum
33       G682      2428139       0.017         Oryza sativa
33       G682      7339148       0.13          Lycopersicon esculentum
33       G682      9302672       0.32          Sorghum bicolor

SEQ ID   Gene ID   Genbank NID   P-value     Species
33       G682      5048991       0.39        Gossypium hirsutum
33       G682      6555777       0.46        Pinus taeda
35       G225      4396287       4.40E-16    Glycine max
35       G225      309571        0.00029     Zea mays
35       G225      3857004       0.001       Populus tremula x Populus tremuloides
35       G225      9410205       0.019       Triticum aestivum
35       G225      9426190       0.025       Triticum turgidum subsp. durum
35       G225      8382118       0.046       Gossypium arboreum
35       G225      6782756       0.27        Oryza sativa
35       G225      7721017       0.4         Lotus japonicus
35       G225      6020136       0.47        Pinus taeda
35       G225      2921331       0.48        Gossypium hirsutum
37       G678      7244424       8.70E-50    Mentha x piperita
37       G678      7776053       2.70E-46    Lotus japonicus
37       G678      7337390       2.90E-46    Lycopersicon esculentum
37       G678      2921335       2.30E-43    Gossypium hirsutum
37       G678      6455590       8.30E-43    Glycine max
37       G678      1491932       1.60E-42    Zea mays
37       G678      5860031       4.80E-40    Pinus taeda
37       G678      7765706       3.20E-38    Medicago truncatula
37       G678      6850206       8.20E-38    Oryza sativa
37       G678      7217727       2.00E-37    Sorghum bicolor
39       G233      6552360       6.50E-66    Nicotiana tabacum
39       G233      20560         7.60E-65    Petunia x hybrida
39       G233      5139813       1.70E-58    Glycine max
39       G233      5891103       3.80E-58    Lycopersicon esculentum
39       G233      6782745       1.80E-52    Oryza sativa
39       G233      7560175       1.80E-51    Medicago truncatula
39       G233      7217727       8.30E-51    Sorghum bicolor
39       G233      8097368       5.80E-49    Hordeum vulgare
39       G233      8381332       4.60E-43    Gossypium arboreum
39       G233      5048991       3.50E-41    Gossypium hirsutum
41       G463      6527230       4.90E-36    Lycopersicon esculentum
41       G463      9305572       5.50E-36    Sorghum bicolor
41       G463      3760881       1.20E-31    Oryza sativa
41       G463      6604917       1.30E-23    Medicago truncatula
41       G463      5058123       2.50E-21    Glycine max
41       G463      5044476       1.10E-19    Gossypium hirsutum
41       G463      9412603       1.70E-17    Triticum aestivum
41       G463      9419394       6.00E-17    Hordeum vulgare
41       G463      7624108       6.20E-17    Gossypium arboreum
41       G463      8547152       3.20E-16    Nicotiana tabacum
43       G2422     7673087       9.60E-45    Petunia integrifolia
43       G2422     7339148       6.30E-43    Lycopersicon esculentum
43       G2422     7673083       7.20E-43    Petunia x hybrida
43       G2422     7673097       3.30E-40    Petunia axillaris
43       G2422     5048991       3.30E-36    Gossypium hirsutum
43       G2422     6455590       3.00E-33    Glycine max
43       G2422     6020191       3.20E-32    Pinus taeda
43       G2422     309571        3.60E-30    Zea mays
43       G2422     7560832       9.00E-30    Medicago truncatula
43       G2422     9363004       1.30E-29    Triticum aestivum
45       G2421     7673087       1.10E-46    Petunia integrifolia

SEQ ID   Gene ID   Genbank NID   P-value     Species
45       G2421     5048991       1.30E-35    Gossypium hirsutum
45       G2421     7673091       1.50E-31    Petunia x hybrida
45       G2421     8380196       7.30E-31    Gossypium arboreum
45       G2421     7673095       1.90E-30    Petunia axillaris
45       G2421     7339148       2.80E-30    Lycopersicon esculentum
45       G2421     8747182       9.00E-30    Medicago truncatula
45       G2421     7217727       1.30E-27    Sorghum bicolor
45       G2421     6073050       5.50E-27    Glycine max
45       G2421     1101769       7.40E-27    Picea mariana
47       G772      8578423       4.80E-58    Mesembryanthemum crystallinum
47       G772      7570276       3.00E-52    Medicago truncatula
47       G772      7411573       1.30E-44    Lycopersicon esculentum
47       G772      6341483       6.30E-33    Glycine max
47       G772      1279639       2.00E-32    Petunia x hybrida
47       G772      7722907       3.50E-32    Lotus japonicus
47       G772      8405571       4.70E-32    Hordeum vulgare
47       G772      6730945       6.40E-32    Oryza sativa
47       G772      9302206       2.50E-31    Sorghum bicolor
47       G772      5047907       1.10E-30    Gossypium hirsutum
49       G866      4760595       3.50E-85    Nicotiana tabacum
49       G866      4894962       1.70E-38    Avena sativa
49       G866      6719425       6.60E-35    Glycine max
49       G866      5273248       1.10E-33    Lycopersicon esculentum
49       G866      9302479       7.40E-33    Sorghum bicolor
49       G866      6799932       3.60E-31    Medicago truncatula
49       G866      4886128       4.50E-31    Zea mays
49       G866      8404566       1.40E-29    Oryza sativa
49       G866      8706346       1.10E-28    Hordeum vulgare
49       G866      1432055       3.50E-26    Petroselinum crispum
51       G941      4239844       3.80E-198   Nicotiana tabacum
51       G941      7739794       1.20E-141   Dianthus caryophyllus
51       G941      7567728       7.10E-102   Medicago truncatula
51       G941      8104209       3.70E-97    Lycopersicon esculentum
51       G941      8894549       2.10E-95    Cicer arietinum
51       G941      5606033       1.60E-79    Glycine max
51       G941      6462339       4.60E-79    Gossypium hirsutum
51       G941      7720839       6.60E-70    Lotus japonicus
51       G941      7324903       1.00E-55    Lycopersicon pennellii
51       G941      2427923       6.90E-47    Oryza sativa
53       G198      4383290       3.50E-64    Lycopersicon esculentum
53       G198      1946266       1.10E-58    Oryza sativa
53       G198      9363004       5.40E-51    Triticum aestivum
53       G198      8381332       6.40E-51    Gossypium arboreum
53       G198      9264503       1.30E-50    Glycine max
53       G198      5050757       4.10E-46    Gossypium hirsutum
53       G198      20562         9.30E-46    Petunia x hybrida
53       G198      7765706       2.70E-45    Medicago truncatula
53       G198      5860031       5.40E-45    Pinus taeda
53       G198      4886263       7.30E-45    Antirrhinum majus

Pilgrim, Marsha 
Riechmann, Jose Luis 
Jiang, Cai-zhong 
Yu, Guo-Liang 
Pineda, Omaira 
Heard, Jacqueline 

<120> Seed Trait Genes 

<130> MBI-0017 

<150> 60/166,228 
<151> 1999-11-17 

<150> 60/197,899 
<151> 2000-04-17 

<150> Plant Trait Modification III 
<151> 2000-08-22 

<160> 54 

<170> PatentIn version 3.0 

<210> 1 
<211> 2240 
<212> DNA 
<213> Arabidopsis thaliana 
<220> 
<221> CDS 
<222> (238)..(2064) 
<223> G214 

<400> 1 

tgagatttct ccatttccgt agcttctggt ctcttttctt tgtttcattg atcaaaagca 60 

aatcacttct tcttcttctt cttctcgatt tcttactgtt ttcttatcca acgaaatctg 120 

gaattaaaaa tggaatcttt atcgaatcca agctgatttt gtttctttca ttgaatcatc 180 

tctctaaagt ggaattttgt aaagagaaga tctgaagttg tgtagaggag cttagtg 237 

atg gag aca aat tcg tct gga gaa gat ctg gtt att aag act cgg aag 285 
Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys 
1 5 10 15 

cca tat acg ata aca aag caa cgt gaa agg tgg act gag gaa gaa cat 333 
Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His 
20 25 30 

aat aga ttc att gaa gct ttg agg ctt tat ggt aga gca tgg cag aag 381 
Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys 
35 40 45 

att gaa gaa cat gta gca aca aaa act gct gtc cag ata aga agt cac 429 
Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His 
50 55 60 

gct cag aaa ttt ttc tcc aag gta gag aaa gag gct gaa gct aaa ggt 477 
Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly 
65 70 75 80 

gta gct atg ggt caa gcg cta gac ata gct att cct cct cca cgg cct 525 
Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro 
85 90 95 

aag cgt aaa cca aac aat cct tat cct cga aag acg gga agt gga acg 573 
Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr 
100 105 110 

atc ctt atg tca aaa acg ggt gtg aat gat gga aaa gag tcc ctt gga 621 
Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly 
115 120 125 

tca gaa aaa gtg tcg cat cct gag atg gcc aat gaa gat cga caa caa 669 
Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln 
130 135 140 

tca aag cct gaa gag aaa act ctg cag gaa gac aac tgt tca gat tgt 717 
Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys 
145 150 155 160 

ttc act cat cag tat ctc tct gct gca tcc tcc atg aat aaa agt tgt 765 
Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys 
165 170 175 

ata gag aca tca aac gca agc act ttc cgc gag ttc ttg cct tca cgg 813 
Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg 
180 185 190 

gaa gag gga agt cag aat aac agg gta aga aag gag tca aac tca gat 861 
Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp 
195 200 205 

ttg aat gca aaa tct ctg gaa aac ggt aat gag caa gga cct cag act 909 
Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr 
210 215 220 

tat ccg atg cat atc cct gtg cta gtg cca ttg ggg agc tca ata aca 957 
Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr 
225 230 235 240 

agt tct cta tca cat cct cct tca gag cca gat agt cat ccc cac aca 1005 
Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr 
245 250 255 

gtt gca gga gat tat cag tcg ttt cct aat cat ata atg tca acc ctt 1053 
Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu 
260 265 270 

tta caa aca ccg gct ctt tat act gcc gca act ttc gcc tca tca ttt 1101 
Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe 
275 280 285 

tgg cct ccc gat tct agt ggt ggc tca cct gtt cca ggg aac tca cct 1149 
Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro 
290 295 300 

ccg aat ctg gct gcc atg gcc gca gcc act gtt gca gct gct agt gct 1197 
Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala 
305 310 315 320 

tgg tgg gct gcc aat gga tta tta cct tta tgt gct cct ctt agt tca 1245 
Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser 
325 330 335 

ggt ggt ttc act agt cat cct cca tct act ttt gga cca tca tgt gat 1293 
Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp 
340 345 350 

gta gag tac aca aaa gca agc act tta caa cat ggt tct gtg cag agc 1341 
Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser 
355 360 365 

cga gag caa gaa cac tcc gag gca tca aag gct cga tct tca ctg gac 1389 
Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp 
370 375 380 

tca gag gat gtt gaa aat aag agt aaa cca gtt tgt cat gag cag cct 1437 
Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro 
385 390 395 400 

tct gca aca cct gag agt gat gca aag ggt tca gat gga gca gga gac 1485 
Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp 
405 410 415 

aga aaa caa gtt gac cgg tcc tcg tgt ggc tca aac act ccg tcg agt 1533 
Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser 
420 425 430 

agt gat gat gtt gag gcg gat gca tca gaa agg caa gag gat ggc acc 1581 
Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr 
435 440 445 

aat ggt gag gtg aaa gaa acg aat gaa gac act aat aaa cct caa act 1629 
Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr 
450 455 460 

tca gag tcc aat gca cgc cgc agt aga atc agc tcc aat ata acc gat 1677 
Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp 
465 470 475 480 

cca tgg aag tct gtg tct gac gag ggt cga att gcc ttc caa gct ctc 1725 
Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu 
485 490 495 

ttc tcc aga gag gta ttg ccg caa agt ttt aca tat cga gaa gaa cac 1773 
Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His 
500 505 510 

aga gag gaa gaa caa caa caa caa gaa caa aga tat cca atg gca ctt 1821 
Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu 
515 520 525 

gat ctt aac ttc aca gct cag tta aca cca gtt gat gat caa gag gag 1869 
Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu 
530 535 540 

aag aga aac aca gga ttt ctt gga atc gga tta gat gct tca aag cta 1917 
Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu 
545 550 555 560 

atg agt aga gga aga aca ggt ttt aaa cca tac aaa aga tgt tcc atg 1965 
Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met 
565 570 575 

gaa gcc aaa gaa agt aga atc ctc aac aac aat cct atc att cat gtg 2013 
Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val 
580 585 590 

gaa cag aaa gat ccc aaa cgg atg cgg ttg gaa act caa gct tcc aca 2061 
Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr 
595 600 605 

tga gactctattt tcatctgatc tgttgtttgt actctgtttt taagttttca 2114 

agaccactgc tacattttct ttttcttttg aggcctttgt atttgtttcc ttgtccatag 2174 

tcttcctgta acatttgact ctgtattatt caacaaatca taaactgttt aatctttttt 2234 

tttcca 2240 



<210> 2 
<211> 608 
<212> PRT 
<213> Arabidopsis thaliana 

<400> 2 

Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys 
1 5 10 15 

Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His 
20 25 30 

Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys 
35 40 45 

Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His 
50 55 60 

Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly 
65 70 75 80 

Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro 
85 90 95 

Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr 
100 105 110 

Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly 
115 120 125 

Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln 
130 135 140 

Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys 
145 150 155 160 

Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys 
165 170 175 

Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg 
180 185 190 

Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp 
195 200 205 

Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr 
210 215 220 

Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr 
225 230 235 240 

Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr 
245 250 255 

Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu 
260 265 270 

Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe 
275 280 285 

Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro 
290 295 300 

Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala 
305 310 315 320 

Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser 
325 330 335 

Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp 
340 345 350 

Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser 
355 360 365 

Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp 
370 375 380 

Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro 
385 390 395 400 

Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp 
405 410 415 

Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser 
420 425 430 

Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr 
435 440 445 

Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr 
450 455 460 

Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp 
465 470 475 480 

Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu 
485 490 495 

Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His 
500 505 510 

Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu 
515 520 525 

Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu 
530 535 540 

Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu 
545 550 555 560 

Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met 
565 570 575 

Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val 
580 585 590 

Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr 
595 600 605 



<210> 3 
<211> 407 
<212> DNA 
<213> Arabidopsis thaliana 
<220> 
<221> CDS 
<222> (10)..(348) 
<223> G226 

<400> 3 

ccagtagtt atg gat aat acc aac cgt ctt cgt ctt cgt cgc ggt ccc agt 51 
Met Asp Asn Thr Asn Arg Leu Arg Leu Arg Arg Gly Pro Ser 
1 5 10 

ctt agg caa act aag ttc act cga tcc cga tat gac tct gaa gaa gtg 99 
Leu Arg Gln Thr Lys Phe Thr Arg Ser Arg Tyr Asp Ser Glu Glu Val 
15 20 25 30 

agt agc atc gaa tgg gag ttt atc agt atg acc gaa caa gaa gaa gat 147 
Ser Ser Ile Glu Trp Glu Phe Ile Ser Met Thr Glu Gln Glu Glu Asp 
35 40 45 

ctc atc tct cga atg tac aga ctt gtc ggt aat agg tgg gat tta ata 195 
Leu Ile Ser Arg Met Tyr Arg Leu Val Gly Asn Arg Trp Asp Leu Ile 
50 55 60 

gca gga aga gtc gta gga aga aag gca aat gag att gag aga tac tgg 243 
Ala Gly Arg Val Val Gly Arg Lys Ala Asn Glu Ile Glu Arg Tyr Trp 
65 70 75 

att atg aga aac tct gac tat ttt tct cac aaa cga cga cgt ctt aat 291 
Ile Met Arg Asn Ser Asp Tyr Phe Ser His Lys Arg Arg Arg Leu Asn 
80 85 90 

aat tct ccc ttt ttt tct act tct cct ctt aat ctc caa gaa aat cta 339 
Asn Ser Pro Phe Phe Ser Thr Ser Pro Leu Asn Leu Gln Glu Asn Leu 
95 100 105 110 

aaa ttg taa agaaatcaaa ataaaagctt tcaatcataa aagtagaaca 388 
Lys Leu 

aatcttgaat gtcttctca 407 

<210> 4 
<211> 112 
<212> PRT 
<213> Arabidopsis thaliana 

<400> 4 

Met Asp Asn Thr Asn Arg Leu Arg Leu Arg Arg Gly Pro Ser Leu Arg 
1 5 10 15 

Gln Thr Lys Phe Thr Arg Ser Arg Tyr Asp Ser Glu Glu Val Ser Ser 
20 25 30 

Ile Glu Trp Glu Phe Ile Ser Met Thr Glu Gln Glu Glu Asp Leu Ile 
35 40 45 

Ser Arg Met Tyr Arg Leu Val Gly Asn Arg Trp Asp Leu Ile Ala Gly 
50 55 60 

Arg Val Val Gly Arg Lys Ala Asn Glu Ile Glu Arg Tyr Trp Ile Met 
65 70 75 80 

Arg Asn Ser Asp Tyr Phe Ser His Lys Arg Arg Arg Leu Asn Asn Ser 
85 90 95 

Pro Phe Phe Ser Thr Ser Pro Leu Asn Leu Gln Glu Asn Leu Lys Leu 
100 105 110 
<210> 5 
<211> 1209 
<212> DNA 
<213> Arabidopsis thaliana 
<220> 
<221> CDS 
<222> (41)..(1156) 
<223> G229 

<400> 5 

ttgtggtcag tggaataaac acatataacc gccggagaaa atg gga aga gcg cca 55 
Met Gly Arg Ala Pro 
1 5 

tgt tgc gag aag gtc ggt atc aag aga ggg cgg tgg acg gcg gag gag 103 
Cys Cys Glu Lys Val Gly Ile Lys Arg Gly Arg Trp Thr Ala Glu Glu 
10 15 20 

gac cag att ctc tcc aac tac att caa tcc aat ggt gaa ggt tct tgg 151 
Asp Gln Ile Leu Ser Asn Tyr Ile Gln Ser Asn Gly Glu Gly Ser Trp 
25 30 35 

aga tct ctc ccc aaa aat gcc gga tta aaa agg tgt gga aag agc tgt 199 
Arg Ser Leu Pro Lys Asn Ala Gly Leu Lys Arg Cys Gly Lys Ser Cys 
40 45 50 

aga ttg aga tgg ata aac tat cta aga tca gac ctc aag cgt gga aac 247 
Arg Leu Arg Trp Ile Asn Tyr Leu Arg Ser Asp Leu Lys Arg Gly Asn 
55 60 65 

ata act cca gaa gaa gaa gaa ctc gtt gtt aaa ttg cat tcc act ttg 295 
Ile Thr Pro Glu Glu Glu Glu Leu Val Val Lys Leu His Ser Thr Leu 
70 75 80 85 

gga aac agg tgg tca cta atc gcg ggt cat cta cca ggg aga aca gac 343 
Gly Asn Arg Trp Ser Leu Ile Ala Gly His Leu Pro Gly Arg Thr Asp 
90 95 100 

aac gaa ata aaa aat tat tgg aac tct cat ctc agc cgt aaa ctc cac 391 
Asn Glu Ile Lys Asn Tyr Trp Asn Ser His Leu Ser Arg Lys Leu His 
105 110 115 

aac ttc att agg aag cca tcc atc tct caa gac gtc tcc gcc gta atc 439 
Asn Phe Ile Arg Lys Pro Ser Ile Ser Gln Asp Val Ser Ala Val Ile 
120 125 130 

atg gcg aac gct tct tca gcg cca ccg ccg ccg cag gca aaa cgc aga 487 
Met Ala Asn Ala Ser Ser Ala Pro Pro Pro Pro Gln Ala Lys Arg Arg 
135 140 145 

ctt ggg aga acg agt agg tcc gct atg aaa cca aaa atc cgc aga aca 535 
Leu Gly Arg Thr Ser Arg Ser Ala Met Lys Pro Lys Ile Arg Arg Thr 
150 155 160 165 

aaa act cgt aaa acg aag aaa acg tct gca cca ccg gag cct aac gcc 583 
Lys Thr Arg Lys Thr Lys Lys Thr Ser Ala Pro Pro Glu Pro Asn Ala 
170 175 180 

gat gta gct ggg gct gat aaa gaa gca tta atg gtg gag tca agt gga 631 
Asp Val Ala Gly Ala Asp Lys Glu Ala Leu Met Val Glu Ser Ser Gly 
185 190 195 

gcc gag gct gag cta gga cga cca tgt gac tac tat gga gat gat tgt 679 
Ala Glu Ala Glu Leu Gly Arg Pro Cys Asp Tyr Tyr Gly Asp Asp Cys 
200 205 210 

aac aaa aat ctc atg agc att aat ggc gat aat gga gtt tta acg ttt 727 
Asn Lys Asn Leu Met Ser Ile Asn Gly Asp Asn Gly Val Leu Thr Phe 
215 220 225 

gat gat gat atc atc gat ctt ttg ttg gac gag tca gat cct ggc cac 775 
Asp Asp Asp Ile Ile Asp Leu Leu Leu Asp Glu Ser Asp Pro Gly His 
230 235 240 245 

ttg tac aca aac aca acg tgc ggt ggt ggt ggg gag ttg cat aac ata 823 
Leu Tyr Thr Asn Thr Thr Cys Gly Gly Gly Gly Glu Leu His Asn Ile 
250 255 260 

aga gac tct gaa gga gcc aga ggg ttc tcg gat act tgg aac caa ggg 871 
Arg Asp Ser Glu Gly Ala Arg Gly Phe Ser Asp Thr Trp Asn Gln Gly 
265 270 275 

aat ctc gac tgt ctt ctt cag tct tgt cca tct gtg gag tcg ttt ctc 919 
Asn Leu Asp Cys Leu Leu Gln Ser Cys Pro Ser Val Glu Ser Phe Leu 
280 285 290 

aac tac gac cac caa gtt aac gac gcg tcg acg gat gag ttt atc gat 967 
Asn Tyr Asp His Gln Val Asn Asp Ala Ser Thr Asp Glu Phe Ile Asp 
295 300 305 

tgg gat tgt gtt tgg caa gaa ggt agt gat aat aat ctt tgg cat gag 1015 
Trp Asp Cys Val Trp Gln Glu Gly Ser Asp Asn Asn Leu Trp His Glu 
310 315 320 325 

aaa gag aat ccc gac tca atg gtc tcg tgg ctt tta gac ggt gat gat 1063 
Lys Glu Asn Pro Asp Ser Met Val Ser Trp Leu Leu Asp Gly Asp Asp 
330 335 340 

gag gcc acg atc ggg aat agt aat tgt gag aac ttt gga gaa ccg tta 1111 
Glu Ala Thr Ile Gly Asn Ser Asn Cys Glu Asn Phe Gly Glu Pro Leu 
345 350 355 

gat cat gac gac gaa agc gct ttg gtc gct tgg ctt ctg tca tga 1156 
Asp His Asp Asp Glu Ser Ala Leu Val Ala Trp Leu Leu Ser 
360 365 370 

tgatattgat tgatccgtta tgtaatcttt tttgtgcatt cacagtttga atc 1209 

<210> 6 

<211> 371 

<212> PRT

<213> Arabidopsis thaliana 

<400> 6 

Met Gly Arg Ala Pro Cys Cys Glu Lys Val Gly Ile Lys Arg Gly Arg
1 5 10 15

Trp Thr Ala Glu Glu Asp Gln Ile Leu Ser Asn Tyr Ile Gln Ser Asn
20 25 30

Gly Glu Gly Ser Trp Arg Ser Leu Pro Lys Asn Ala Gly Leu Lys Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Ser Asp
50 55 60

Leu Lys Arg Gly Asn Ile Thr Pro Glu Glu Glu Glu Leu Val Val Lys
65 70 75 80

Leu His Ser Thr Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly His Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Ser His Leu
100 105 110

Ser Arg Lys Leu His Asn Phe Ile Arg Lys Pro Ser Ile Ser Gln Asp
115 120 125

Val Ser Ala Val Ile Met Ala Asn Ala Ser Ser Ala Pro Pro Pro Pro
130 135 140

Gln Ala Lys Arg Arg Leu Gly Arg Thr Ser Arg Ser Ala Met Lys Pro
145 150 155 160

Lys Ile Arg Arg Thr Lys Thr Arg Lys Thr Lys Lys Thr Ser Ala Pro
165 170 175

Pro Glu Pro Asn Ala Asp Val Ala Gly Ala Asp Lys Glu Ala Leu Met
180 185 190

Val Glu Ser Ser Gly Ala Glu Ala Glu Leu Gly Arg Pro Cys Asp Tyr
195 200 205

Tyr Gly Asp Asp Cys Asn Lys Asn Leu Met Ser Ile Asn Gly Asp Asn
210 215 220

Gly Val Leu Thr Phe Asp Asp Asp Ile Ile Asp Leu Leu Leu Asp Glu
225 230 235 240

Ser Asp Pro Gly His Leu Tyr Thr Asn Thr Thr Cys Gly Gly Gly Gly
245 250 255

Glu Leu His Asn Ile Arg Asp Ser Glu Gly Ala Arg Gly Phe Ser Asp
260 265 270

Thr Trp Asn Gln Gly Asn Leu Asp Cys Leu Leu Gln Ser Cys Pro Ser
275 280 285

Val Glu Ser Phe Leu Asn Tyr Asp His Gln Val Asn Asp Ala Ser Thr
290 295 300

Asp Glu Phe Ile Asp Trp Asp Cys Val Trp Gln Glu Gly Ser Asp Asn
305 310 315 320

Asn Leu Trp His Glu Lys Glu Asn Pro Asp Ser Met Val Ser Trp Leu
325 330 335

Leu Asp Gly Asp Asp Glu Ala Thr Ile Gly Asn Ser Asn Cys Glu Asn
340 345 350

Phe Gly Glu Pro Leu Asp His Asp Asp Glu Ser Ala Leu Val Ala Trp
355 360 365

Leu Leu Ser
370



<210> 7 

<211> 1046 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (46)..(867)

<223> G241

<400> 7

gaaaaacatt tcaacttctt ttatcagcaa tcacaaatca aagag atg gga aga gct 57
Met Gly Arg Ala
1

cca tgc tgt gag aag atg ggg ttg aag aga gga cca tgg aca cct gaa 105
Pro Cys Cys Glu Lys Met Gly Leu Lys Arg Gly Pro Trp Thr Pro Glu
5 10 15 20

gaa gat caa atc ttg gtc tct ttt atc ctc aac cat gga cat agt aac 153
Glu Asp Gln Ile Leu Val Ser Phe Ile Leu Asn His Gly His Ser Asn
25 30 35

tgg cga gcc ctc cct aag caa gct ggt ctt ttg aga tgt gga aaa agc 201
Trp Arg Ala Leu Pro Lys Gln Ala Gly Leu Leu Arg Cys Gly Lys Ser
40 45 50

tgt aga ctt agg tgg atg aac tat tta aag cct gat att aaa cgt ggc 249
Cys Arg Leu Arg Trp Met Asn Tyr Leu Lys Pro Asp Ile Lys Arg Gly
55 60 65

aat ttc acc aaa gaa gag gaa gat gct atc atc agc tta cac caa ata 297
Asn Phe Thr Lys Glu Glu Glu Asp Ala Ile Ile Ser Leu His Gln Ile
70 75 80

ctt ggc aat aga tgg tca gcg att gca gca aaa ctg cct gga aga acc 345
Leu Gly Asn Arg Trp Ser Ala Ile Ala Ala Lys Leu Pro Gly Arg Thr
85 90 95 100

gat aac gag atc aag aac gta tgg cac act cac ttg aag aag aga ctc 393
Asp Asn Glu Ile Lys Asn Val Trp His Thr His Leu Lys Lys Arg Leu
105 110 115

gaa gat tat caa cca gct aaa cct aag acc agc aac aaa aag aag ggt 441
Glu Asp Tyr Gln Pro Ala Lys Pro Lys Thr Ser Asn Lys Lys Lys Gly
120 125 130

act aaa cca aaa tct gaa tcc gta ata acg agc tcg aac agt act aga 489
Thr Lys Pro Lys Ser Glu Ser Val Ile Thr Ser Ser Asn Ser Thr Arg
135 140 145

agc gaa tcg gag cta gca gat tca tca aac cct tct gga gaa agc tta 537
Ser Glu Ser Glu Leu Ala Asp Ser Ser Asn Pro Ser Gly Glu Ser Leu
150 155 160

ttt tcg aca tcg cct tcg aca agt gag gtt tct tcg atg aca ctc ata 585
Phe Ser Thr Ser Pro Ser Thr Ser Glu Val Ser Ser Met Thr Leu Ile
165 170 175 180

agc cac gac ggc tat agc aac gag att aat atg gat aac aaa ccg gga 633
Ser His Asp Gly Tyr Ser Asn Glu Ile Asn Met Asp Asn Lys Pro Gly
185 190 195

gat atc agt act atc gat caa gaa tgt gtt tct ttc gaa act ttt ggt 681
Asp Ile Ser Thr Ile Asp Gln Glu Cys Val Ser Phe Glu Thr Phe Gly
200 205 210

gcg gat atc gat gaa agc ttc tgg aaa gag aca ctg tat agc caa gat 729
Ala Asp Ile Asp Glu Ser Phe Trp Lys Glu Thr Leu Tyr Ser Gln Asp
215 220 225

gaa cac aac tac gta tcg aat gac cta gaa gtc gct ggt tta gtt gag 777
Glu His Asn Tyr Val Ser Asn Asp Leu Glu Val Ala Gly Leu Val Glu
230 235 240

ata caa caa gag ttt caa aac ttg ggc tcc gct aat aat gag atg att 825
Ile Gln Gln Glu Phe Gln Asn Leu Gly Ser Ala Asn Asn Glu Met Ile
245 250 255 260

ttt gac agt gag atg gaa ctt ctg gtt cga tgt att ggc tag 867
Phe Asp Ser Glu Met Glu Leu Leu Val Arg Cys Ile Gly
265 270

aaccggcggg gaacaagatc tcttagccgg gctctagtta acatgtttga ggagtaaagt 927

gaaatggcgc aaattagtta aggctaagaa attcaaaagc ttttgtttac cgagaaaaaa 987

acacactcta actcttgatg tgatgtagtt agtgtattaa ttagaggctg cgttttcaa 1046

<210> 8 

<211> 273 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 8 

Met Gly Arg Ala Pro Cys Cys Glu Lys Met Gly Leu Lys Arg Gly Pro
1 5 10 15

Trp Thr Pro Glu Glu Asp Gln Ile Leu Val Ser Phe Ile Leu Asn His
20 25 30

Gly His Ser Asn Trp Arg Ala Leu Pro Lys Gln Ala Gly Leu Leu Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Met Asn Tyr Leu Lys Pro Asp
50 55 60

Ile Lys Arg Gly Asn Phe Thr Lys Glu Glu Glu Asp Ala Ile Ile Ser
65 70 75 80

Leu His Gln Ile Leu Gly Asn Arg Trp Ser Ala Ile Ala Ala Lys Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Val Trp His Thr His Leu
100 105 110

Lys Lys Arg Leu Glu Asp Tyr Gln Pro Ala Lys Pro Lys Thr Ser Asn
115 120 125

Lys Lys Lys Gly Thr Lys Pro Lys Ser Glu Ser Val Ile Thr Ser Ser
130 135 140

Asn Ser Thr Arg Ser Glu Ser Glu Leu Ala Asp Ser Ser Asn Pro Ser
145 150 155 160

Gly Glu Ser Leu Phe Ser Thr Ser Pro Ser Thr Ser Glu Val Ser Ser
165 170 175

Met Thr Leu Ile Ser His Asp Gly Tyr Ser Asn Glu Ile Asn Met Asp
180 185 190

Asn Lys Pro Gly Asp Ile Ser Thr Ile Asp Gln Glu Cys Val Ser Phe
195 200 205

Glu Thr Phe Gly Ala Asp Ile Asp Glu Ser Phe Trp Lys Glu Thr Leu
210 215 220

Tyr Ser Gln Asp Glu His Asn Tyr Val Ser Asn Asp Leu Glu Val Ala
225 230 235 240

Gly Leu Val Glu Ile Gln Gln Glu Phe Gln Asn Leu Gly Ser Ala Asn
245 250 255

Asn Glu Met Ile Phe Asp Ser Glu Met Glu Leu Leu Val Arg Cys Ile
260 265 270

Gly



<210> 9

<211> 989

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (41)..(664)

<223> G464 

<400> 9 

ctctgctggt atcattggag tctagggttt tgttattgac atg cgt ggt gtg tca 55
Met Arg Gly Val Ser
1 5

gaa ttg gag gtg ggg aag agt aat ctt ccg gcg gag agt gag ctg gaa 103
Glu Leu Glu Val Gly Lys Ser Asn Leu Pro Ala Glu Ser Glu Leu Glu
10 15 20

ttg gga tta ggg ctc agc ctc ggt ggt ggc gcg tgg aaa gag cgt ggg 151
Leu Gly Leu Gly Leu Ser Leu Gly Gly Gly Ala Trp Lys Glu Arg Gly
25 30 35

agg att ctt act gct aag gat ttt cct tcc gtt ggg tct aaa cgc tct 199
Arg Ile Leu Thr Ala Lys Asp Phe Pro Ser Val Gly Ser Lys Arg Ser
40 45 50

gct gaa tct tcc tct cac caa gga gct tct cct cct cgt tca agt caa 247
Ala Glu Ser Ser Ser His Gln Gly Ala Ser Pro Pro Arg Ser Ser Gln
55 60 65

gtg gta gga tgg cca cca att ggg tta cac agg atg aac agt ttg gtt 295
Val Val Gly Trp Pro Pro Ile Gly Leu His Arg Met Asn Ser Leu Val
70 75 80 85

aat aac caa gct atg aag gca gca aga gcg gaa gaa gga gac ggg gag 343
Asn Asn Gln Ala Met Lys Ala Ala Arg Ala Glu Glu Gly Asp Gly Glu
90 95 100

aag aaa gtt gtg aag aat ggt gag ctc aaa gat gtg tca atg aag gtg 391
Lys Lys Val Val Lys Asn Gly Glu Leu Lys Asp Val Ser Met Lys Val
105 110 115

aat ccg aaa gtt cag ggc tta ggg ttt gtt aag gtg aat atg gat gga 439
Asn Pro Lys Val Gln Gly Leu Gly Phe Val Lys Val Asn Met Asp Gly
120 125 130

gtt ggt ata ggc aga aaa gtg gat atg aga gct cat tcg tct tac gaa 487
Val Gly Ile Gly Arg Lys Val Asp Met Arg Ala His Ser Ser Tyr Glu
135 140 145

aac ttg gct cag acg ctt gag gaa atg ttc ttt gga atg aca ggt act 535
Asn Leu Ala Gln Thr Leu Glu Glu Met Phe Phe Gly Met Thr Gly Thr
150 155 160 165

act tgt cga gaa acg gtt aaa cct tta agg ctt tta gat gga tca tca 583
Thr Cys Arg Glu Thr Val Lys Pro Leu Arg Leu Leu Asp Gly Ser Ser
170 175 180

gac ttt gta ctc act tat gaa gat aag ggg att gga tgc ttg ttg gag 631
Asp Phe Val Leu Thr Tyr Glu Asp Lys Gly Ile Gly Cys Leu Leu Glu
185 190 195

atg ttc cat gga gaa tgt tta tca act cgg tga aaaggcttcg gatcatggga 684
Met Phe His Gly Glu Cys Leu Ser Thr Arg
200 205

acctcagaag ctagtggact agctccaaga cgtcaagagc agaaggatag acaaagaaac 744

aaccctgttt agcttccctt ccaaagctgg cattgtttat gtattgtttg aggtttgcaa 804

tttactcgat actttttgaa gaaagtattt tggagaatat ggataaaagc atgcagaagc 864

ttagatatga tttgaatccg gttttcggat atggttttgc ttaggtcatt caactcgtag 924

ttttccagtt tgtttcttct ttggctgtgt accaattatc tatgttctgt gagagaaagc 984

tcttg 989

<210> 10
<211> 207 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 10 

Met Arg Gly Val Ser Glu Leu Glu Val Gly Lys Ser Asn Leu Pro Ala
1 5 10 15

Glu Ser Glu Leu Glu Leu Gly Leu Gly Leu Ser Leu Gly Gly Gly Ala
20 25 30

Trp Lys Glu Arg Gly Arg Ile Leu Thr Ala Lys Asp Phe Pro Ser Val
35 40 45

Gly Ser Lys Arg Ser Ala Glu Ser Ser Ser His Gln Gly Ala Ser Pro
50 55 60

Pro Arg Ser Ser Gln Val Val Gly Trp Pro Pro Ile Gly Leu His Arg
65 70 75 80

Met Asn Ser Leu Val Asn Asn Gln Ala Met Lys Ala Ala Arg Ala Glu
85 90 95

Glu Gly Asp Gly Glu Lys Lys Val Val Lys Asn Gly Glu Leu Lys Asp
100 105 110

Val Ser Met Lys Val Asn Pro Lys Val Gln Gly Leu Gly Phe Val Lys
115 120 125

Val Asn Met Asp Gly Val Gly Ile Gly Arg Lys Val Asp Met Arg Ala
130 135 140

His Ser Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu Glu Met Phe Phe
145 150 155 160

Gly Met Thr Gly Thr Thr Cys Arg Glu Thr Val Lys Pro Leu Arg Leu
165 170 175

Leu Asp Gly Ser Ser Asp Phe Val Leu Thr Tyr Glu Asp Lys Gly Ile
180 185 190

Gly Cys Leu Leu Glu Met Phe His Gly Glu Cys Leu Ser Thr Arg
195 200 205

<210> 11
<211> 1033
<212> DNA

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (113)..(862)

<223> G663 

<400> 11 

gtcgacccac gcgtccgtgg gaagccacaa taacccccta ttcctcggcc ttttttaaaa 60 

aagttttaga ataatccgat aaaatacttt tatattaatt tttctttggt cc atg gag 118 

Met Glu 
1 

ggt tcg tcc aaa ggg ttg agg aaa ggt gca tgg act gct gaa gaa gat 166
Gly Ser Ser Lys Gly Leu Arg Lys Gly Ala Trp Thr Ala Glu Glu Asp
5 10 15

agt ctc ttg agg cta tgt att gat aag tat gga gaa ggc aaa tgg cat 214
Ser Leu Leu Arg Leu Cys Ile Asp Lys Tyr Gly Glu Gly Lys Trp His
20 25 30

caa gtt cct ttg aga gct ggg cta aat cga tgc aga aag agt tgt aga 262
Gln Val Pro Leu Arg Ala Gly Leu Asn Arg Cys Arg Lys Ser Cys Arg
35 40 45 50

cta aga tgg ttg aac tat ttg aag cca agt atc aag aga gga aga ctt 310
Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly Arg Leu
55 60 65

agc aat gat gaa gtt gat ctt ctt ctt cgc ctt cat aag ctt cta gga 358
Ser Asn Asp Glu Val Asp Leu Leu Leu Arg Leu His Lys Leu Leu Gly
70 75 80

aat agg tgg tcc ttg att gct ggt cga ttg cct ggt cgg acc gct aat 406
Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr Ala Asn
85 90 95

gat gtc aaa aat tac tgg aac acc cat ctg agt aaa aaa cat gag tct 454
Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His Glu Ser
100 105 110

tcg tgt tgt aag tct aaa atg aaa aag aaa aac att att tcc cct cct 502
Ser Cys Cys Lys Ser Lys Met Lys Lys Lys Asn Ile Ile Ser Pro Pro
115 120 125 130

aca aca ccg gtc caa aaa atc ggt gtt ttt aag cct cga cct cga tcc 550
Thr Thr Pro Val Gln Lys Ile Gly Val Phe Lys Pro Arg Pro Arg Ser
135 140 145

ttc tct gtt aac aat ggt tgc agc cat ctc aat ggt ctg cca gaa gtt 598
Phe Ser Val Asn Asn Gly Cys Ser His Leu Asn Gly Leu Pro Glu Val
150 155 160

gat tta att cct tca tgc ctt gga ctc aag aaa aat aat gtt tgt gaa 646
Asp Leu Ile Pro Ser Cys Leu Gly Leu Lys Lys Asn Asn Val Cys Glu
165 170 175

aat agt atc aca tgt aac aaa gat gat gag aaa gat gat ttt gtg aat 694
Asn Ser Ile Thr Cys Asn Lys Asp Asp Glu Lys Asp Asp Phe Val Asn
180 185 190

aat cta atg aat gga gat aat atg tgg ttg gag aat tta ctg ggg gaa 742
Asn Leu Met Asn Gly Asp Asn Met Trp Leu Glu Asn Leu Leu Gly Glu
195 200 205 210

aac caa gaa gct gat gcg att gtt cct gaa gcg acg aca gct gaa cat 790
Asn Gln Glu Ala Asp Ala Ile Val Pro Glu Ala Thr Thr Ala Glu His
215 220 225

ggg gcc act ttg gcg ttt gac gtt gag caa ctt tgg agt ctg ttt gat 838
Gly Ala Thr Leu Ala Phe Asp Val Glu Gln Leu Trp Ser Leu Phe Asp
230 235 240

gga gag act gtt gaa ctt gat tag tgtttctcac cgtttgttta agattgtggg 892
Gly Glu Thr Val Glu Leu Asp
245

tggcttttct ttcgtatttt agtaatgtat ttttctgtat gaagtaaaga atttcagcat 952

tttaagaaaa atggttatgt ttctacgtaa taaaaaaaaa cgttatttat aaaaaaaaaa 1012

aaaaaaaaaa aaaaaaaaaa a 1033



<210> 12 

<211> 249 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 12 

Met Glu Gly Ser Ser Lys Gly Leu Arg Lys Gly Ala Trp Thr Ala Glu 
1 5 10 15 

Glu Asp Ser Leu Leu Arg Leu Cys Ile Asp Lys Tyr Gly Glu Gly Lys
20 25 30

Trp His Gln Val Pro Leu Arg Ala Gly Leu Asn Arg Cys Arg Lys Ser
35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly
50 55 60

Arg Leu Ser Asn Asp Glu Val Asp Leu Leu Leu Arg Leu His Lys Leu
65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr
85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His
100 105 110

Glu Ser Ser Cys Cys Lys Ser Lys Met Lys Lys Lys Asn Ile Ile Ser
115 120 125

Pro Pro Thr Thr Pro Val Gln Lys Ile Gly Val Phe Lys Pro Arg Pro
130 135 140

Arg Ser Phe Ser Val Asn Asn Gly Cys Ser His Leu Asn Gly Leu Pro
145 150 155 160

Glu Val Asp Leu Ile Pro Ser Cys Leu Gly Leu Lys Lys Asn Asn Val
165 170 175

Cys Glu Asn Ser Ile Thr Cys Asn Lys Asp Asp Glu Lys Asp Asp Phe
180 185 190

Val Asn Asn Leu Met Asn Gly Asp Asn Met Trp Leu Glu Asn Leu Leu
195 200 205

Gly Glu Asn Gln Glu Ala Asp Ala Ile Val Pro Glu Ala Thr Thr Ala
210 215 220

Glu His Gly Ala Thr Leu Ala Phe Asp Val Glu Gln Leu Trp Ser Leu
225 230 235 240

Phe Asp Gly Glu Thr Val Glu Leu Asp
245



<210> 13

<211> 1640

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (76)..(1431)

<223> G776

<400> 13



tgcaattgaa ggtgaggttt ggtgaaaggg aaattgagaa aaccctagaa caagtacggt 60 

ctctattttg cttta atg ggt cgc gaa tct gtg gct gtt gtg act gcg ccg 111
Met Gly Arg Glu Ser Val Ala Val Val Thr Ala Pro
1 5 10

ccc tcg gcg act gct ccg ggt act gct tcg gtg gcg acc tcg ctt gct 159
Pro Ser Ala Thr Ala Pro Gly Thr Ala Ser Val Ala Thr Ser Leu Ala
15 20 25

cct ggc ttc cga ttt cat ccg act gat gag gaa ctc gtg agc tat tac 207
Pro Gly Phe Arg Phe His Pro Thr Asp Glu Glu Leu Val Ser Tyr Tyr
30 35 40

ttg aag agg aag gtt ctg ggc caa cct gta cgc ttc gat gcg att gga 255
Leu Lys Arg Lys Val Leu Gly Gln Pro Val Arg Phe Asp Ala Ile Gly
45 50 55 60

gag gtc gat ata tac aag cat gag ccc tgg gat tta gca gtg ttt tcg 303
Glu Val Asp Ile Tyr Lys His Glu Pro Trp Asp Leu Ala Val Phe Ser
65 70 75

aga ttg aag aca agg gac caa gaa tgg tac ttc tac agt gca tta gat 351
Arg Leu Lys Thr Arg Asp Gln Glu Trp Tyr Phe Tyr Ser Ala Leu Asp
80 85 90

aag aag tat gga aac ggt gct agg atg aac cga gca act aac aga ggg 399
Lys Lys Tyr Gly Asn Gly Ala Arg Met Asn Arg Ala Thr Asn Arg Gly
95 100 105

tac tgg aaa gct act gga aaa gac aga gaa atc cgc cgt gac att ctg 447
Tyr Trp Lys Ala Thr Gly Lys Asp Arg Glu Ile Arg Arg Asp Ile Leu
110 115 120

ctt ctc ggt atg aaa aag aca ctt gtt ttc cac agt ggg cgt gca cca 495
Leu Leu Gly Met Lys Lys Thr Leu Val Phe His Ser Gly Arg Ala Pro
125 130 135 140

gac ggg ctt cgg act aat tgg gtt atg cat gag tat cgc ctt gtg gaa 543
Asp Gly Leu Arg Thr Asn Trp Val Met His Glu Tyr Arg Leu Val Glu
145 150 155

tat gaa acc gag aaa aac gga aac ctg gtg caa gat gca tat gtg ttg 591
Tyr Glu Thr Glu Lys Asn Gly Asn Leu Val Gln Asp Ala Tyr Val Leu
160 165 170

tgt aga gtc ttc cac aag aat aac att ggg cca cca agt ggg aac aga 639
Cys Arg Val Phe His Lys Asn Asn Ile Gly Pro Pro Ser Gly Asn Arg
175 180 185

tat gct ccg ttc atg gaa gag gaa tgg gct gat gat gaa gga gct ctg 687
Tyr Ala Pro Phe Met Glu Glu Glu Trp Ala Asp Asp Glu Gly Ala Leu
190 195 200

att cca gga ata gac gtt aag ctc agg cta gag ccg ccg cca gta gcc 735
Ile Pro Gly Ile Asp Val Lys Leu Arg Leu Glu Pro Pro Pro Val Ala
205 210 215 220

aat gga aac gac cag atg gac cag gaa atc cag tca gcc agc aag agt 783
Asn Gly Asn Asp Gln Met Asp Gln Glu Ile Gln Ser Ala Ser Lys Ser
225 230 235

ctc atc aac atc aat gag cca ccg aga gag aca gct cca ctg gat atc 831
Leu Ile Asn Ile Asn Glu Pro Pro Arg Glu Thr Ala Pro Leu Asp Ile
240 245 250

gaa tcg gac caa cag aat cat cat gag aat gac ctc aag ccg gag gag 879
Glu Ser Asp Gln Gln Asn His His Glu Asn Asp Leu Lys Pro Glu Glu
255 260 265

cat aac aac aat aat aat tat gat gaa aac gag gaa aca ctc aaa cgc 927
His Asn Asn Asn Asn Asn Tyr Asp Glu Asn Glu Glu Thr Leu Lys Arg
270 275 280

gag cag atg gaa gaa gag gag cgt cct cct cga cct gta tgc gtt ctc 975
Glu Gln Met Glu Glu Glu Glu Arg Pro Pro Arg Pro Val Cys Val Leu
285 290 295 300

aac aaa gaa gct cca tta cct ctt ctg caa tac aaa cgt aga cgc caa 1023
Asn Lys Glu Ala Pro Leu Pro Leu Leu Gln Tyr Lys Arg Arg Arg Gln
305 310 315

agc gag tcc aac aac aac tca agc agg aac aca cag gac cat tgt tcg 1071
Ser Glu Ser Asn Asn Asn Ser Ser Arg Asn Thr Gln Asp His Cys Ser
320 325 330

tcc aca aca aca act gtc gac aat aca acc act tta atc tca tca tct 1119
Ser Thr Thr Thr Thr Val Asp Asn Thr Thr Thr Leu Ile Ser Ser Ser
335 340 345

gcc gct gcc acc aac act gcc atc tct gca ttg ctt gag ttc tca ctc 1167
Ala Ala Ala Thr Asn Thr Ala Ile Ser Ala Leu Leu Glu Phe Ser Leu
350 355 360

atg ggt atc tcc gac aag aaa gaa aag ccg cag caa ccg cta cgt cct 1215
Met Gly Ile Ser Asp Lys Lys Glu Lys Pro Gln Gln Pro Leu Arg Pro
365 370 375 380

cac aag gaa cct ttg cct cct caa act cca ctt gca tct cct gaa gag 1263
His Lys Glu Pro Leu Pro Pro Gln Thr Pro Leu Ala Ser Pro Glu Glu
385 390 395

aag gtt aat gat ctc cag aag gag att cac cag atg tct gtt gaa aga 1311
Lys Val Asn Asp Leu Gln Lys Glu Ile His Gln Met Ser Val Glu Arg
400 405 410

gaa act ttc aag ctt gaa atg atg agt gca gaa gct atg atc agt att 1359
Glu Thr Phe Lys Leu Glu Met Met Ser Ala Glu Ala Met Ile Ser Ile
415 420 425

ctc cag tca agg atc gat gcg ctg cgt cag gag aac gag gaa ctc aag 1407
Leu Gln Ser Arg Ile Asp Ala Leu Arg Gln Glu Asn Glu Glu Leu Lys
430 435 440

aag aac aat gct aat gga caa taa aggctctaaa aacatctctc caggttactt 1461
Lys Asn Asn Ala Asn Gly Gln
445 450

cttattgccc ttcgcctttt atttagcttt aatctcccta atactatgac ccatctacat 1521
agctcctcta gacagattgc gaactgtgtg aatctctgtt gtaacatagg ataaaacgga 1581
ttcgagcccc tgagctgagt gttttatcct tcttctttta aaaaaaaaaa aaaaaaaaa 1640



<210> 14 

<211> 451 

<212> PRT 

<213> Arabidopsis thaliana

<400> 14

Met Gly Arg Glu Ser Val Ala Val Val Thr Ala Pro Pro Ser Ala Thr
1 5 10 15

Ala Pro Gly Thr Ala Ser Val Ala Thr Ser Leu Ala Pro Gly Phe Arg
20 25 30

Phe His Pro Thr Asp Glu Glu Leu Val Ser Tyr Tyr Leu Lys Arg Lys
35 40 45

Val Leu Gly Gln Pro Val Arg Phe Asp Ala Ile Gly Glu Val Asp Ile
50 55 60

Tyr Lys His Glu Pro Trp Asp Leu Ala Val Phe Ser Arg Leu Lys Thr
65 70 75 80

Arg Asp Gln Glu Trp Tyr Phe Tyr Ser Ala Leu Asp Lys Lys Tyr Gly
85 90 95

Asn Gly Ala Arg Met Asn Arg Ala Thr Asn Arg Gly Tyr Trp Lys Ala
100 105 110

Thr Gly Lys Asp Arg Glu Ile Arg Arg Asp Ile Leu Leu Leu Gly Met
115 120 125

Lys Lys Thr Leu Val Phe His Ser Gly Arg Ala Pro Asp Gly Leu Arg
130 135 140

Thr Asn Trp Val Met His Glu Tyr Arg Leu Val Glu Tyr Glu Thr Glu
145 150 155 160

Lys Asn Gly Asn Leu Val Gln Asp Ala Tyr Val Leu Cys Arg Val Phe
165 170 175

His Lys Asn Asn Ile Gly Pro Pro Ser Gly Asn Arg Tyr Ala Pro Phe
180 185 190

Met Glu Glu Glu Trp Ala Asp Asp Glu Gly Ala Leu Ile Pro Gly Ile
195 200 205

Asp Val Lys Leu Arg Leu Glu Pro Pro Pro Val Ala Asn Gly Asn Asp
210 215 220

Gln Met Asp Gln Glu Ile Gln Ser Ala Ser Lys Ser Leu Ile Asn Ile
225 230 235 240

Asn Glu Pro Pro Arg Glu Thr Ala Pro Leu Asp Ile Glu Ser Asp Gln
245 250 255

Gln Asn His His Glu Asn Asp Leu Lys Pro Glu Glu His Asn Asn Asn
260 265 270



Asn Asn Tyr Asp Glu Asn Glu Glu Thr Leu Lys Arg Glu Gln Met Glu
275 280 285

MBI-17 Sequence Listing.ST25

Glu Glu Glu Arg Pro Pro Arg Pro Val Cys Val Leu Asn Lys Glu Ala
290 295 300

Pro Leu Pro Leu Leu Gln Tyr Lys Arg Arg Arg Gln Ser Glu Ser Asn
305 310 315 320

Asn Asn Ser Ser Arg Asn Thr Gln Asp His Cys Ser Ser Thr Thr Thr
325 330 335

Thr Val Asp Asn Thr Thr Thr Leu Ile Ser Ser Ser Ala Ala Ala Thr
340 345 350

Asn Thr Ala Ile Ser Ala Leu Leu Glu Phe Ser Leu Met Gly Ile Ser
355 360 365

Asp Lys Lys Glu Lys Pro Gln Gln Pro Leu Arg Pro His Lys Glu Pro
370 375 380

Leu Pro Pro Gln Thr Pro Leu Ala Ser Pro Glu Glu Lys Val Asn Asp
385 390 395 400

Leu Gln Lys Glu Ile His Gln Met Ser Val Glu Arg Glu Thr Phe Lys
405 410 415

Leu Glu Met Met Ser Ala Glu Ala Met Ile Ser Ile Leu Gln Ser Arg
420 425 430

Ile Asp Ala Leu Arg Gln Glu Asn Glu Glu Leu Lys Lys Asn Asn Ala
435 440 445

Asn Gly Gln
450



<210> 15

<211> 1389

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (50)..(1249)

<223> G778

<400> 15



tctcaataac acaaaacctt ttaaactagt aaaatacaca gattttagg atg agc caa 58
Met Ser Gln
1

tgt gtt cca aac tgt cac atc gat gat act ccg gca gca gcc acc acc 106
Cys Val Pro Asn Cys His Ile Asp Asp Thr Pro Ala Ala Ala Thr Thr
5 10 15

acc gtc cgc tcc acc aca gcc gca gac atc ccc ata tta gac tac gag 154
Thr Val Arg Ser Thr Thr Ala Ala Asp Ile Pro Ile Leu Asp Tyr Glu
20 25 30 35

gta gcc gag ctg acg tgg gag aac ggg caa cta ggc ttg cac ggc tta 202
Val Ala Glu Leu Thr Trp Glu Asn Gly Gln Leu Gly Leu His Gly Leu
40 45 50

ggt cca ccg cga gtg acg gct tcg tcg acc aag tac tcc aca ggc gcc 250
Gly Pro Pro Arg Val Thr Ala Ser Ser Thr Lys Tyr Ser Thr Gly Ala
55 60 65

ggt gga acg ttg gag tcg ata gtg gac caa gct act cgc ctc cct aac 298
Gly Gly Thr Leu Glu Ser Ile Val Asp Gln Ala Thr Arg Leu Pro Asn
70 75 80

cct aag ccc acg gat gag ctc gtc ccg tgg ttc cat cat cgc tcc tcc 346
Pro Lys Pro Thr Asp Glu Leu Val Pro Trp Phe His His Arg Ser Ser
85 90 95

agg gcc gcg atg gca atg gac gcg ctt gtc cct tgc tcc aac cta gta 394
Arg Ala Ala Met Ala Met Asp Ala Leu Val Pro Cys Ser Asn Leu Val
100 105 110 115

cac gag cag cag agc aag cct ggt ggc gtt ggc tcc acc cgg gtg ggg 442
His Glu Gln Gln Ser Lys Pro Gly Gly Val Gly Ser Thr Arg Val Gly
120 125 130

tca tgt agc gat ggt cgt acc atg ggc ggt gga aaa cga gca aga gtg 490
Ser Cys Ser Asp Gly Arg Thr Met Gly Gly Gly Lys Arg Ala Arg Val
135 140 145

gca ccg gag tgg agc ggc ggc ggg agt cag cgg ctg acc atg gac act 538
Ala Pro Glu Trp Ser Gly Gly Gly Ser Gln Arg Leu Thr Met Asp Thr
150 155 160

tac gac gta ggt ttc acc tca aca tca atg ggc tcg cac gat aac aca 586
Tyr Asp Val Gly Phe Thr Ser Thr Ser Met Gly Ser His Asp Asn Thr
165 170 175

atc gac gat cat gac tcc gtc tgc cac agc cgc cca cag atg gag gac 634
Ile Asp Asp His Asp Ser Val Cys His Ser Arg Pro Gln Met Glu Asp
180 185 190 195

gaa gaa gag aag aaa gcc gga gga aaa tca tca gtt tca acc aag aga 682
Glu Glu Glu Lys Lys Ala Gly Gly Lys Ser Ser Val Ser Thr Lys Arg
200 205 210

agc aga gct gct gct att cat aac caa tcc gaa cgt aag agg aga gat 730
Ser Arg Ala Ala Ala Ile His Asn Gln Ser Glu Arg Lys Arg Arg Asp
215 220 225

aaa atc aat caa agg atg aag act ttg caa aaa ctg gtt ccc aat tcc 778
Lys Ile Asn Gln Arg Met Lys Thr Leu Gln Lys Leu Val Pro Asn Ser
230 235 240

agc aag acg gat aaa gca tct atg ttg gat gaa gtg ata gag tat ttg 826
Ser Lys Thr Asp Lys Ala Ser Met Leu Asp Glu Val Ile Glu Tyr Leu
245 250 255

aag caa ctt caa gca caa gtg agc atg atg agc aga atg aat atg cct 874
Lys Gln Leu Gln Ala Gln Val Ser Met Met Ser Arg Met Asn Met Pro
260 265 270 275

tct atg atg ctt cct atg gcc atg cag caa caa caa caa cta caa atg 922
Ser Met Met Leu Pro Met Ala Met Gln Gln Gln Gln Gln Leu Gln Met
280 285 290

tct ctc atg tcc aat ccc atg ggt tta ggg atg ggc atg ggg atg ccc 970
Ser Leu Met Ser Asn Pro Met Gly Leu Gly Met Gly Met Gly Met Pro
295 300 305

ggt ctc ggt ctc ctc gac ctt aat tct atg aac cga gct gct gca agc 1018
Gly Leu Gly Leu Leu Asp Leu Asn Ser Met Asn Arg Ala Ala Ala Ser
310 315 320

gct cct aat atc cat gcc aac atg atg cca aac cca ttt ttg ccc atg 1066
Ala Pro Asn Ile His Ala Asn Met Met Pro Asn Pro Phe Leu Pro Met
325 330 335

aat tgt cca tcg tgg gat gct tct tcc aat gac tct cga ttt cag tct 1114
Asn Cys Pro Ser Trp Asp Ala Ser Ser Asn Asp Ser Arg Phe Gln Ser
340 345 350 355



cct ctc atc ccc gat cct atg tct gcc ttt ctt gca tgc tct act cag 1162
Pro Leu Ile Pro Asp Pro Met Ser Ala Phe Leu Ala Cys Ser Thr Gln
360 365 370

cca acg acg atg gaa gcg tat agc agg atg gct aca tta tat cag caa 1210
Pro Thr Thr Met Glu Ala Tyr Ser Arg Met Ala Thr Leu Tyr Gln Gln
375 380 385

atg caa caa caa ctt cct cct cct tcg aat cca aaa tga ttattactca 1259
Met Gln Gln Gln Leu Pro Pro Pro Ser Asn Pro Lys
390 395

aacacctcta tatagtttac gtctatatat gtgttagtca catacataca tatatatatt 1319
ccatcataat tatttattta tatgtatagg cttctcatga attatgatat tatacgtatt 1379

acgtaaaaaa 1389

<210> 16

<211> 399

<212> PRT

<213> Arabidopsis thaliana

<400> 16

Met Ser Gln Cys Val Pro Asn Cys His Ile Asp Asp Thr Pro Ala Ala
1 5 10 15

Ala Thr Thr Thr Val Arg Ser Thr Thr Ala Ala Asp Ile Pro Ile Leu
20 25 30

Asp Tyr Glu Val Ala Glu Leu Thr Trp Glu Asn Gly Gln Leu Gly Leu
35 40 45

His Gly Leu Gly Pro Pro Arg Val Thr Ala Ser Ser Thr Lys Tyr Ser
50 55 60

Thr Gly Ala Gly Gly Thr Leu Glu Ser Ile Val Asp Gln Ala Thr Arg
65 70 75 80

Leu Pro Asn Pro Lys Pro Thr Asp Glu Leu Val Pro Trp Phe His His
85 90 95

Arg Ser Ser Arg Ala Ala Met Ala Met Asp Ala Leu Val Pro Cys Ser 
100 105 110 

Asn Leu Val His Glu Gln Gln Ser Lys Pro Gly Gly Val Gly Ser Thr
115 120 125

Arg Val Gly Ser Cys Ser Asp Gly Arg Thr Met Gly Gly Gly Lys Arg
130 135 140

Ala Arg Val Ala Pro Glu Trp Ser Gly Gly Gly Ser Gln Arg Leu Thr
145 150 155 160

Met Asp Thr Tyr Asp Val Gly Phe Thr Ser Thr Ser Met Gly Ser His
165 170 175

Asp Asn Thr Ile Asp Asp His Asp Ser Val Cys His Ser Arg Pro Gln
180 185 190

Met Glu Asp Glu Glu Glu Lys Lys Ala Gly Gly Lys Ser Ser Val Ser
195 200 205

Thr Lys Arg Ser Arg Ala Ala Ala Ile His Asn Gln Ser Glu Arg Lys
210 215 220

Arg Arg Asp Lys Ile Asn Gln Arg Met Lys Thr Leu Gln Lys Leu Val
225 230 235 240

Pro Asn Ser Ser Lys Thr Asp Lys Ala Ser Met Leu Asp Glu Val Ile
245 250 255

Glu Tyr Leu Lys Gln Leu Gln Ala Gln Val Ser Met Met Ser Arg Met
260 265 270

Asn Met Pro Ser Met Met Leu Pro Met Ala Met Gln Gln Gln Gln Gln
275 280 285

Leu Gln Met Ser Leu Met Ser Asn Pro Met Gly Leu Gly Met Gly Met
290 295 300

Gly Met Pro Gly Leu Gly Leu Leu Asp Leu Asn Ser Met Asn Arg Ala
305 310 315 320

Ala Ala Ser Ala Pro Asn Ile His Ala Asn Met Met Pro Asn Pro Phe
325 330 335

Leu Pro Met Asn Cys Pro Ser Trp Asp Ala Ser Ser Asn Asp Ser Arg
340 345 350

Phe Gln Ser Pro Leu Ile Pro Asp Pro Met Ser Ala Phe Leu Ala Cys
355 360 365

Ser Thr Gln Pro Thr Thr Met Glu Ala Tyr Ser Arg Met Ala Thr Leu
370 375 380

Tyr Gln Gln Met Gln Gln Gln Leu Pro Pro Pro Ser Asn Pro Lys
385 390 395

<210> 17 

<211> 1126 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (282)..(920)

<223> G865 

<400> 17 

atccccactt gttgttcatc accaagccaa gctccatgtc ctagtcactc cacagattcc 60 

ctatcatcat caattcgttt caaacttagt tcctttcaaa gtcttgtaca tatatacaca 120 

cacacctatt attctcttgg tgtgtttgtg tgttacatat acgtgtgagt acatactttg 180 

ttgtaaaagt ggatcggagg tatggaaagg gaccggttcc accggaaaca tcggcggcgg 240 

cggatgataa ttcgtcttgg aacgagactg atgtcaccgc c atg gtc tcc gct ctc 296
Met Val Ser Ala Leu
1 5

agc cgt gtc ata gag aat ccg aca gac ccg ccg gtc aaa caa gag ctt 344
Ser Arg Val Ile Glu Asn Pro Thr Asp Pro Pro Val Lys Gln Glu Leu
10 15 20

gat aaa tcg gat caa cat caa cca gac caa gat caa cca aga aga aga 392
Asp Lys Ser Asp Gln His Gln Pro Asp Gln Asp Gln Pro Arg Arg Arg
25 30 35

cac tat aga ggc gta agg cag aga cca tgg ggt aaa tgg gcg gca gaa 440
His Tyr Arg Gly Val Arg Gln Arg Pro Trp Gly Lys Trp Ala Ala Glu
40 45 50

atc cgc gat cca aag aaa gca gcc cgt gtc tgg ctc ggg act ttc gag 488
Ile Arg Asp Pro Lys Lys Ala Ala Arg Val Trp Leu Gly Thr Phe Glu
55 60 65

acg gca gag gaa gct gct tta gcc tat gac cga gct gcc ctc aaa ttc 536
Thr Ala Glu Glu Ala Ala Leu Ala Tyr Asp Arg Ala Ala Leu Lys Phe
70 75 80 85

aaa ggc acc aag gct aaa ctg aac ttc cct gaa cgg gtc caa ggc cct 584
Lys Gly Thr Lys Ala Lys Leu Asn Phe Pro Glu Arg Val Gln Gly Pro
90 95 100

act acc acc aca acc att tct cat gca cca aga gga gtt agt gaa tcc 632
Thr Thr Thr Thr Thr Ile Ser His Ala Pro Arg Gly Val Ser Glu Ser
105 110 115

atg aac tca cct cct cct cga cct ggt cca cct tca act act act act 680
Met Asn Ser Pro Pro Pro Arg Pro Gly Pro Pro Ser Thr Thr Thr Thr
120 125 130

tcg tgg cca atg act tat aac cag gac ata ctt caa tac gct cag ttg 728
Ser Trp Pro Met Thr Tyr Asn Gln Asp Ile Leu Gln Tyr Ala Gln Leu
135 140 145

ctt acg agt aac aat gag gtt gat tta tca tac tac acg tcg act ctc 776
Leu Thr Ser Asn Asn Glu Val Asp Leu Ser Tyr Tyr Thr Ser Thr Leu
150 155 160 165

ttc agt caa cct ttfe tca acg cct tct tca tct tct tct tcc tcc caa 824
Phe Ser Gln Pro Phe Ser Thr Pro Ser Ser Ser Ser Ser Ser Ser Gln
170 175 180

cag acg cag caa cag cag cta caa caa caa caa cag cag cgt gaa gaa 872
Gln Thr Gln Gln Gln Gln Leu Gln Gln Gln Gln Gln Gln Arg Glu Glu
185 190 195

gaa gag aag aat tat ggt tac aat tat tat aac tac cca aga gaa taa 920
Glu Glu Lys Asn Tyr Gly Tyr Asn Tyr Tyr Asn Tyr Pro Arg Glu
200 205 210

tctaattatt attgttggtc gaatcagttt tataaatagc tatcatagtt tcatttttgg 980

tttccgtaac ctttgttgca tggaaaatat gaatgaacga gggacatgtg taacaatttg 1040

tttgtgtttc gtaaatgtta gttgtatttg gatttgctga agtttgattt tctgagcata 1100

aatcatttga cggtcaaaaa aaaaaa 1126

<210> 18 
<211> 212
<212> PRT 

<213> Arabidopsis thaliana 
<400> 18 

Met Val Ser Ala Leu Ser Arg Val Ile Glu Asn Pro Thr Asp Pro Pro
1 5 10 15

Val Lys Gln Glu Leu Asp Lys Ser Asp Gln His Gln Pro Asp Gln Asp
20 25 30

Gln Pro Arg Arg Arg His Tyr Arg Gly Val Arg Gln Arg Pro Trp Gly
35 40 45

Lys Trp Ala Ala Glu Ile Arg Asp Pro Lys Lys Ala Ala Arg Val Trp
50 55 60

Leu Gly Thr Phe Glu Thr Ala Glu Glu Ala Ala Leu Ala Tyr Asp Arg
65 70 75 80

Ala Ala Leu Lys Phe Lys Gly Thr Lys Ala Lys Leu Asn Phe Pro Glu
85 90 95

Arg Val Gln Gly Pro Thr Thr Thr Thr Thr Ile Ser His Ala Pro Arg
100 105 110

Gly Val Ser Glu Ser Met Asn Ser Pro Pro Pro Arg Pro Gly Pro Pro
115 120 125

Ser Thr Thr Thr Thr Ser Trp Pro Met Thr Tyr Asn Gln Asp Ile Leu
130 135 140

Gln Tyr Ala Gln Leu Leu Thr Ser Asn Asn Glu Val Asp Leu Ser Tyr
145 150 155 160

Tyr Thr Ser Thr Leu Phe Ser Gln Pro Phe Ser Thr Pro Ser Ser Ser
165 170 175

Ser Ser Ser Ser Gln Gln Thr Gln Gln Gln Gln Leu Gln Gln Gln Gln
180 185 190

Gln Gln Arg Glu Glu Glu Glu Lys Asn Tyr Gly Tyr Asn Tyr Tyr Asn
195 200 205

Tyr Pro Arg Glu
210



<210> 19
<211> 1571
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (428)..(1402)
<223> G869
<400> 19



aggaacagtg aaaggttcgg ttttttgggt ttcgatctga taatcaacaa gaaaaaaggg 60 

tttgatttat gtcggctggg tttgaatcga ctgtgatttt gtctttgatt catatctctt 120 

ctccgatttc atcatcatct tccccatcat cgtcgtcttt gaaatcttgt cttctcaacg 180 

ctcttcactt ctgctgtaat aagcagaggc ttgttctgga gactccttct ctttccatgc 240 

gcttaagacc caaaaggact tgttctagtg ttgaagtctt tgggggtttt cacataaagc 300 

agcaaaagtt ttcttttttc atagttcgct gagagttttg agttttgata ccaaaaaagt 360 

tttgaccttt tagagtgatt ttttgttctt tctgttttct gggtattttt gaggagtggg 420 

tttaaca atg gtt gcg att aga aag gaa cag tct ttg agt ggt gtt agt 469
Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser
1 5 10

agc gag att aag aag aga gct aag aga aac act cta tcg tcc ctt cct 517
Ser Glu Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro
15 20 25 30

caa gaa acc caa cct ttg agg aaa gtc cgt att att gtg aat gat cct 565
Gln Glu Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro
35 40 45

tat gct act gat gat tcc tct agt gat gag gaa gag ctt aag gtt cct 613
Tyr Ala Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro
50 55 60

aag cca agg aaa atg aaa cgt atc gtt cgt gag att aac ttt cct tct 661
Lys Pro Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser
65 70 75

atg gaa gtt tct gaa cag cct tct gag agt tct tct cag gac agt act 709
Met Glu Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr
80 85 90

aaa act gat ggc aag ata gct gtg tca gct tct cct gct gtt cct agg 757
Lys Thr Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg
95 100 105 110

aag aag cct gtt ggt gtt agg caa agg aaa tgg ggg aaa tgg gct gct 805
Lys Lys Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala
115 120 125

gag att aga gat cct att aag aaa act agg act tgg ttg ggt act ttt 853
Glu Ile Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe
130 135 140

gat act ctt gaa gaa gct gct aaa gct tat gat gct aag aag ctt gag 901
Asp Thr Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu
145 150 155

ttt gat gct att gtt gct gga aat gtg tcc act act aaa cgt gat gtt 949
Phe Asp Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val
160 165 170

tct tca tct gag act agc caa tgc tct cgt tct tca cct gtt gtt cct 997
Ser Ser Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro
175 180 185 190

gtt gag caa gat gac act tct gca tca gct ctc act tgt gtc aac aac 1045
Val Glu Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn
195 200 205

cct gat gac gtc tcg acc gtt gct cca act gct cca act cca aat gtt 1093
Pro Asp Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val
210 215 220

cct gct ggt gga aac aag gaa acg ttg ttc gat ttc gac ttt act aat 1141
Pro Ala Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn
225 230 235

cta cag atc cct gat ttt ggt ttc ttg gca gag gag caa caa gac cta 1189
Leu Gln Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu
240 245 250

gac ttc gat tgt ttc ctc gcg gat gat cag ttt gat gat ttc ggc ttg 1237
Asp Phe Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu
255 260 265 270

ctt gat gac att caa gga ttc gaa gat aac ggt cca agt gcg tta cca 1285
Leu Asp Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro
275 280 285

gat ttc gac ttt gcg gat gtt gaa gat ctt cag cta gct gac tct agt 1333
Asp Phe Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser
290 295 300

ttc ggt ttc ctt gat caa ctt gct cct atc aac atc tct tgc cca tta 1381
Phe Gly Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu
305 310 315

aaa agt ttt gca gct tca tag gatcttgctt agtaatgtta agtgagaaga 1432
Lys Ser Phe Ala Ala Ser
320

gtgttttgtt ttttcgttta tgctttagta atttaagaca tacaaaagtg tgtgttccgg 1492

attgtagtaa gatcttaaga cataaagccg ggttttgcaa ttaggaatcg agttttaatg 1552

aagttttagt ttatgtttg 1571

<210> 20 
<211> 324 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 20 

Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser Ser Glu
1 5 10 15

Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro Gln Glu
20 25 30

Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro Tyr Ala
35 40 45

Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro Lys Pro
50 55 60

Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser Met Glu
65 70 75 80

Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr Lys Thr
85 90 95

Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg Lys Lys
100 105 110

Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala Glu Ile
115 120 125

Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe Asp Thr
130 135 140

Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu Phe Asp
145 150 155 160

Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val Ser Ser
165 170 175

Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro Val Glu
180 185 190

Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn Pro Asp
195 200 205

Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val Pro Ala
210 215 220

Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn Leu Gln
225 230 235 240

Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu Asp Phe
245 250 255

Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu Leu Asp
260 265 270

Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro Asp Phe
275 280 285

Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser Phe Gly
290 295 300

Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu Lys Ser
305 310 315 320

Phe Ala Ala Ser



<210> 21
<211> 1195
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (67)..(1041)
<223> G883
<400> 21

ctctctcgtc ttcgtcttct tcttcttcaa cgttcctctc caaaatcctc agaccaagaa 60

atcatc atg gcc gtc gat cta atg cgt ttc cct aag ata gat gat caa 108
Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln
1 5 10

acg gct att cag gaa gct gca tcg caa ggt tta caa agt atg gaa cat 156
Thr Ala Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His
15 20 25 30

ctg atc cgt gtc ctc tct aac cgt ccc gaa caa caa cac aac gtt gac 204
Leu Ile Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp
35 40 45

tgc tcc gag atc act gac ttc acc gtt tct aaa ttc aaa acc gtc att 252
Cys Ser Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile
50 55 60

tct ctc ctt aac cgt act ggt cac gct cgg ttc aga cgc gga ccg gtt 300
Ser Leu Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val
65 70 75

cac tcc act tcc tct gcc gca tct cag aaa cta cag agt cag atc gtt 348
His Ser Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val
80 85 90

aaa aat act caa cct gag gct ccg ata gtg aga aca act acg aat cac 396
Lys Asn Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His
95 100 105 110

cct caa atc gtt cct cca ccg tct agt gta aca ctc gat ttc tct aaa 444
Pro Gln Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys
115 120 125

cca agc atc ttc ggc acc aaa gct aag agc gcc gag ctg gaa ttc tcc 492
Pro Ser Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser
130 135 140

aaa gaa aac ttc agt gtt tct tta aac tcc tca ttc atg tcg tcg gcg 540
Lys Glu Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala
145 150 155

ata acc gga gac ggc agc gtc tcc aat gga aaa atc ttc ctt gct tct 588
Ile Thr Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser
160 165 170

gct ccg tcg cag cct gtt aac tct tcc gga aaa cca ccg ttg gct ggt 636
Ala Pro Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly
175 180 185 190

cat cct tac aga aag aga tgt ctc gag cat gag cac tca gag agt ttc 684
His Pro Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe
195 200 205

tcc gga aaa gtc tcc ggc tcc gcc tac gga aag tgc cat tgc aag aaa 732
Ser Gly Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys
210 215 220

agg aaa aat cgg atg aag aga acc gtg aga gta ccg gcg ata agt gca 780
Arg Lys Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala
225 230 235

aag atc gcc gat att cca ccg gac gaa tat tcg tgg agg aag tac gga 828
Lys Ile Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly
240 245 250

caa aaa ccg atc aag ggc tca cca cac cca cgt ggt tac tac aag tgc 876
Gln Lys Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys
255 260 265 270

agt aca ttc aga ggia tgt cca gcg agg aaa cac gtg gaa cga gca tta 924
Ser Thr Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu
275 280 285

gat gat cca gcg atg ctt att gtg aca tac gaa gga gag cac cgt cat 972
Asp Asp Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His
290 295 300

aac caa tcc gcg atg cag gag aat att tct tct tca ggc att aat gat 1020
Asn Gln Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp
305 310 315

tta gtg ttt gcc tcg gct tga cttttttttg tactatttgt tttttgattt 1071
Leu Val Phe Ala Ser Ala
320

tttgagtact ttagatggat tgaaatttgt aaattttttt attaagaaat caatttaaat 1131
agagaaaaat tagtggtggt gcaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1191
aaaa 1195

<210> 22 
<211> 324 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 22 

Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln Thr Ala
1 5 10 15

Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His Leu Ile
20 25 30

Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp Cys Ser
35 40 45

Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile Ser Leu
50 55 60

Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val His Ser
65 70 75 80

Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val Lys Asn
85 90 95

Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His Pro Gln
100 105 110

Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys Pro Ser
115 120 125

Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser Lys Glu
130 135 140

Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala Ile Thr
145 150 155 160

Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser Ala Pro
165 170 175

Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly His Pro
180 185 190

Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe Ser Gly
195 200 205

Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys Arg Lys
210 215 220

Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala Lys Ile
225 230 235 240

Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly Gln Lys
245 250 255

Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys Ser Thr
260 265 270

Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu Asp Asp
275 280 285

Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His Asn Gln
290 295 300

Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp Leu Val
305 310 315 320

Phe Ala Ser Ala

<210> 23
<211> 1755
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (1)..(1755)
<223> G938
<400> 23



atg atg atg ttt aac gag atg gga atg tat gga aac atg gat ttc ttc 48
Met Met Met Phe Asn Glu Met Gly Met Tyr Gly Asn Met Asp Phe Phe
1 5 10 15

tct tcc tcc aca tct ctc gat gtg tgt cca tta cca caa gct gaa caa 96
Ser Ser Ser Thr Ser Leu Asp Val Cys Pro Leu Pro Gln Ala Glu Gln
20 25 30

gaa cct gta gtt gaa gat gtc gac tac acc gat gat gag atg gat gtg 144
Glu Pro Val Val Glu Asp Val Asp Tyr Thr Asp Asp Glu Met Asp Val
35 40 45

gat gag ctt gag aag agg atg tgg aga gac aaa atg cgt ttg aaa cgt 192
Asp Glu Leu Glu Lys Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg
50 55 60

ctc aag gag caa cag agt aag tgt aaa gaa ggc gtc gat ggt tcg aaa 240
Leu Lys Glu Gln Gln Ser Lys Cys Lys Glu Gly Val Asp Gly Ser Lys
65 70 75 80

cag agg cag tcg caa gag caa gct agg agg aag aaa atg tct aga gcc 288
Gln Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala
85 90 95

caa gat ggg atc ttg aag tat atg ttg aag atg atg gaa gtt tgt aaa 336
Gln Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys
100 105 110

gct caa ggc ttt gtt tat ggt att att cct gag aag ggt aag cct gtg 384
Ala Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Lys Gly Lys Pro Val
115 120 125

act ggt gct tcg gat aat ttg agg gaa tgg tgg aaa gat aag gtt agg 432
Thr Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg
130 135 140

ttt gat cgt aat ggt cca gct gct att gct aag tat cag tca gag aat 480
Phe Asp Arg Asn Gly Pro Ala Ala Ile Ala Lys Tyr Gln Ser Glu Asn
145 150 155 160

aat att tct gga ggg agt aat gat tgt aac agc ttg gtt ggt cca aca 528
Asn Ile Ser Gly Gly Ser Asn Asp Cys Asn Ser Leu Val Gly Pro Thr
165 170 175

ccg cat acg ctt cag gag ctt cag gac acg act ctt ggt tcg ctt tta 576
Pro His Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu
180 185 190

tcg gct ttg atg caa cat tgt gat cca ccg cag aga cgg ttt cct ttg 624
Ser Ala Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu
195 200 205

gag aaa gga gtt tct cca cct tgg tgg cct aat ggg aat gaa gag tgg 672
Glu Lys Gly Val Ser Pro Pro Trp Trp Pro Asn Gly Asn Glu Glu Trp
210 215 220

tgg cct cag ctt ggt tta cca aat gag caa ggt cct cct cct tat aag 720
Trp Pro Gln Leu Gly Leu Pro Asn Glu Gln Gly Pro Pro Pro Tyr Lys
225 230 235 240

aag cct cat gat ttg aag aaa gct tgg aaa gtc ggt gtt tta act gcg 768
Lys Pro His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala
245 250 255



gtg atc aag cat atg tcg ccg gat att gcg aag atc cgt aag ctt gtg 816
Val Ile Lys His Met Ser Pro Asp Ile Ala Lys Ile Arg Lys Leu Val
260 265 270

agg caa tca aaa tgc ttg cag gat aag atg acg gcg aaa gag agt gct 864
Arg Gln Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala
275 280 285

act tgg ctt gcc att att aac caa gaa gag gtt gtg gct cgg gag ctt 912
Thr Trp Leu Ala Ile Ile Asn Gln Glu Glu Val Val Ala Arg Glu Leu
290 295 300

tat ccc gag tca tgc cct cct ctt tct tct tct tca tca tta gga agc 960
Tyr Pro Glu Ser Cys Pro Pro Leu Ser Ser Ser Ser Ser Leu Gly Ser
305 310 315 320

ggg tcg ctt ctc att aat gat tgt agc gag tat gac gtt gaa ggt ttc 1008
Gly Ser Leu Leu Ile Asn Asp Cys Ser Glu Tyr Asp Val Glu Gly Phe
325 330 335

gag aag gaa caa cat ggt ttc gat gtg gaa gag cgg aaa cca gag ata 1056
Glu Lys Glu Gln His Gly Phe Asp Val Glu Glu Arg Lys Pro Glu Ile
340 345 350

gtg atg atg cat cct cta gca agc ttt ggg gtt gct aaa atg caa cat 1104
Val Met Met His Pro Leu Ala Ser Phe Gly Val Ala Lys Met Gln His
355 360 365

ttt ccc ata aag gag gag gtc gcc acc acg gta aac tta gag ttc acg 1152
Phe Pro Ile Lys Glu Glu Val Ala Thr Thr Val Asn Leu Glu Phe Thr
370 375 380

aga aag agg aag cag aac aat gat atg aat gtt atg gta atg gac aga 1200
Arg Lys Arg Lys Gln Asn Asn Asp Met Asn Val Met Val Met Asp Arg
385 390 395 400

tca gca ggt tac act tgt gag aat ggt cag tgt cct cac agc aaa atg 1248
Ser Ala Gly Tyr Thr Cys Glu Asn Gly Gln Cys Pro His Ser Lys Met
405 410 415

aat ctt gga ttt caa gac agg agt tca agg gac aac cac cag atg gtt 1296
Asn Leu Gly Phe Gln Asp Arg Ser Ser Arg Asp Asn His Gln Met Val
420 425 430

tgt cca tat aga gac aat cgt tta gcg tat gga gca tcc aag ttt cat 1344
Cys Pro Tyr Arg Asp Asn Arg Leu Ala Tyr Gly Ala Ser Lys Phe His
435 440 445

atg ggt gga atg aaa cta gta gtt cct cag caa cca gtc caa ccg atc 1392
Met Gly Gly Met Lys Leu Val Val Pro Gln Gln Pro Val Gln Pro Ile
450 455 460

gac cta tcg ggc gtt gga gtt ccg gaa aac ggg cag aag atg atc acc 1440
Asp Leu Ser Gly Val Gly Val Pro Glu Asn Gly Gln Lys Met Ile Thr
465 470 475 480

gag ctt atg gcc atg tac gac aga aat gtc caa agc aac caa acg cct 1488
Glu Leu Met Ala Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Pro
485 490 495

cct act ttg atg gaa aac caa agc atg gtc att gat gca aaa gca gct 1536
Pro Thr Leu Met Glu Asn Gln Ser Met Val Ile Asp Ala Lys Ala Ala
500 505 510

cag aat cag cag ctg aat ttc aac agt ggc aat caa atg ttt atg caa 1584
Gln Asn Gln Gln Leu Asn Phe Asn Ser Gly Asn Gln Met Phe Met Gln
515 520 525

caa ggg acg aac aac ggg gtt aac aat cgg ttc cag atg gtg ttt gat 1632
Gln Gly Thr Asn Asn Gly Val Asn Asn Arg Phe Gln Met Val Phe Asp
530 535 540

tcg aca cca ttc gat atg gca gca ttc gat tac aga gat gat tgg caa 1680
Ser Thr Pro Phe Asp Met Ala Ala Phe Asp Tyr Arg Asp Asp Trp Gln
545 550 555 560

acc gga gca atg gaa gga atg ggg aag cag cag cag cag cag cag cag 1728
Thr Gly Ala Met Glu Gly Met Gly Lys Gln Gln Gln Gln Gln Gln Gln
565 570 575

cag caa gat gta tca ata tgg ttc tga 1755
Gln Gln Asp Val Ser Ile Trp Phe
580

<210> 24 
<211> 584 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 24 

Met Met Met Phe Asn Glu Met Gly Met Tyr Gly Asn Met Asp Phe Phe
1 5 10 15

Ser Ser Ser Thr Ser Leu Asp Val Cys Pro Leu Pro Gln Ala Glu Gln
20 25 30

Glu Pro Val Val Glu Asp Val Asp Tyr Thr Asp Asp Glu Met Asp Val
35 40 45

Asp Glu Leu Glu Lys Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg
50 55 60

Leu Lys Glu Gln Gln Ser Lys Cys Lys Glu Gly Val Asp Gly Ser Lys
65 70 75 80

Gln Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala
85 90 95

Gln Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys
100 105 110

Ala Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Lys Gly Lys Pro Val
115 120 125

Thr Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg
130 135 140

Phe Asp Arg Asn Gly Pro Ala Ala Ile Ala Lys Tyr Gln Ser Glu Asn
145 150 155 160

Asn Ile Ser Gly Gly Ser Asn Asp Cys Asn Ser Leu Val Gly Pro Thr
165 170 175

Pro His Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu
180 185 190

Ser Ala Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu
195 200 205

Glu Lys Gly Val Ser Pro Pro Trp Trp Pro Asn Gly Asn Glu Glu Trp
210 215 220

Trp Pro Gln Leu Gly Leu Pro Asn Glu Gln Gly Pro Pro Pro Tyr Lys
225 230 235 240

Lys Pro His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala
245 250 255

Val Ile Lys His Met Ser Pro Asp Ile Ala Lys Ile Arg Lys Leu Val
260 265 270

Arg Gln Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala
275 280 285

Thr Trp Leu Ala Ile Ile Asn Gln Glu Glu Val Val Ala Arg Glu Leu
290 295 300

Tyr Pro Glu Ser Cys Pro Pro Leu Ser Ser Ser Ser Ser Leu Gly Ser
305 310 315 320

Gly Ser Leu Leu Ile Asn Asp Cys Ser Glu Tyr Asp Val Glu Gly Phe
325 330 335

Glu Lys Glu Gln His Gly Phe Asp Val Glu Glu Arg Lys Pro Glu Ile
340 345 350

Val Met Met His Pro Leu Ala Ser Phe Gly Val Ala Lys Met Gln His
355 360 365

Phe Pro Ile Lys Glu Glu Val Ala Thr Thr Val Asn Leu Glu Phe Thr
370 375 380

Arg Lys Arg Lys Gln Asn Asn Asp Met Asn Val Met Val Met Asp Arg
385 390 395 400

Ser Ala Gly Tyr Thr Cys Glu Asn Gly Gln Cys Pro His Ser Lys Met
405 410 415

Asn Leu Gly Phe Gln Asp Arg Ser Ser Arg Asp Asn His Gln Met Val
420 425 430

Cys Pro Tyr Arg Asp Asn Arg Leu Ala Tyr Gly Ala Ser Lys Phe His
435 440 445

Met Gly Gly Met Lys Leu Val Val Pro Gln Gln Pro Val Gln Pro Ile
450 455 460

Asp Leu Ser Gly Val Gly Val Pro Glu Asn Gly Gln Lys Met Ile Thr
465 470 475 480

Glu Leu Met Ala Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Pro
485 490 495

Pro Thr Leu Met Glu Asn Gln Ser Met Val Ile Asp Ala Lys Ala Ala
500 505 510

Gln Asn Gln Gln Leu Asn Phe Asn Ser Gly Asn Gln Met Phe Met Gln
515 520 525

Gln Gly Thr Asn Asn Gly Val Asn Asn Arg Phe Gln Met Val Phe Asp
530 535 540

Ser Thr Pro Phe Asp Met Ala Ala Phe Asp Tyr Arg Asp Asp Trp Gln
545 550 555 560

Thr Gly Ala Met Glu Gly Met Gly Lys Gln Gln Gln Gln Gln Gln Gln
565 570 575

Gln Gln Asp Val Ser Ile Trp Phe
580



<210> 25
<211> 1161
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (67)..(1041)
<223> G1328
<400> 25



aattcaatca ctatattttt ttaaaaacat ttgacttcat cgatcggtta acaattaatc 60

aaaaag atg gga cga tca cca tgt tgt gag aag aag aat ggt ctc aag 108
Met Gly Arg Ser Pro Cys Cys Glu Lys Lys Asn Gly Leu Lys
1 5 10

aaa gga cca tgg act cct gag gag gat caa aag ctc att gat tat atc 156
Lys Gly Pro Trp Thr Pro Glu Glu Asp Gln Lys Leu Ile Asp Tyr Ile
15 20 25 30

aat ata cat ggt tat gga aat tgg aga act ctt ccc aag aat gct ggg 204
Asn Ile His Gly Tyr Gly Asn Trp Arg Thr Leu Pro Lys Asn Ala Gly
35 40 45

tta caa aga tgt ggt aag agt tgt cgt ctc cgg tgg acc aac tat ctc 252
Leu Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu
50 55 60

cga cca gat att aag cgt gga aga ttc tct ttt gaa gaa gaa gaa acc 300
Arg Pro Asp Ile Lys Arg Gly Arg Phe Ser Phe Glu Glu Glu Glu Thr
65 70 75

att att caa ctt cac agc atc atg gga aac aag tgg tct gcg att gcg 348
Ile Ile Gln Leu His Ser Ile Met Gly Asn Lys Trp Ser Ala Ile Ala
80 85 90

gct cgt ttg cct gga aga aca gac aac gag atc aaa aac tat tgg aac 396
Ala Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn
95 100 105 110

act cac atc aga aaa aga ctt cta aag atg gga atc gac ccg gtt aca 444
Thr His Ile Arg Lys Arg Leu Leu Lys Met Gly Ile Asp Pro Val Thr
115 120 125

cac act cca cgt ctt gat ctt ctc gat atc tcc tcc att ctc agc tca 492
His Thr Pro Arg Leu Asp Leu Leu Asp Ile Ser Ser Ile Leu Ser Ser
130 135 140

tct atc tac aac tct tcg cat cat cat cat cat cat cat caa caa cat 540
Ser Ile Tyr Asn Ser Ser His His His His His His His Gln Gln His
145 150 155

atg aac atg tcg agg ctc atg atg agt gat ggt aat cat caa cca ttg 588
Met Asn Met Ser Arg Leu Met Met Ser Asp Gly Asn His Gln Pro Leu
160 165 170

gtt aac ccc gag ata ctc aaa ctc gca acc tct ctc ttt tca aac caa 636
Val Asn Pro Glu Ile Leu Lys Leu Ala Thr Ser Leu Phe Ser Asn Gln
175 180 185 190

aac cac ccc aac aac aca cac gag aac aac acg gtt aac caa acc gaa 684
Asn His Pro Asn Asn Thr His Glu Asn Asn Thr Val Asn Gln Thr Glu
195 200 205

gta aac caa tac caa acc ggt tac aac atg cct ggt aat gaa gaa tta 732
Val Asn Gln Tyr Gln Thr Gly Tyr Asn Met Pro Gly Asn Glu Glu Leu
210 215 220

caa tct tgg ttc cct atc atg gat caa ttc acg aat ttc caa gac ctc 780
Gln Ser Trp Phe Pro Ile Met Asp Gln Phe Thr Asn Phe Gln Asp Leu
225 230 235

atg cca atg aag acg acg gtc caa aat tca ttg tca tac gat gat gat 828
Met Pro Met Lys Thr Thr Val Gln Asn Ser Leu Ser Tyr Asp Asp Asp
240 245 250

tgt tcg aag tcc aat ttt gta tta gaa cct tat tac tcc gac ttt gct 876
Cys Ser Lys Ser Asn Phe Val Leu Glu Pro Tyr Tyr Ser Asp Phe Ala
255 260 265 270

tca gtc ttg acc aca cct tct tca agc ccg act ccg tta aac tca agt 924
Ser Val Leu Thr Thr Pro Ser Ser Ser Pro Thr Pro Leu Asn Ser Ser
275 280 285

tcc tca act tac atc aat agt agc act tgc agc acc gag gat gaa aaa 972
Ser Ser Thr Tyr Ile Asn Ser Ser Thr Cys Ser Thr Glu Asp Glu Lys
290 295 300

gag agt tat tac agt gat aat atc act aat tat tcg ttt gat gtt aat 1020
Glu Ser Tyr Tyr Ser Asp Asn Ile Thr Asn Tyr Ser Phe Asp Val Asn
305 310 315

ggt ttt ctc caa ttc caa taa acaaaacgcc attggaatag agttatgtaa 1071
Gly Phe Leu Gln Phe Gln
320

acatgcaatc attgtatttg ttatatagat tttgttacat atccaaaatc caaaatacta 1131
tagttttaaa ataaaaaaaa aaaaaaaaaa 1161

<210> 26 
<211> 324 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 26 

Met Gly Arg Ser Pro Cys Cys Glu Lys Lys Asn Gly Leu Lys Lys Gly
1 5 10 15

Pro Trp Thr Pro Glu Glu Asp Gln Lys Leu Ile Asp Tyr Ile Asn Ile
20 25 30

His Gly Tyr Gly Asn Trp Arg Thr Leu Pro Lys Asn Ala Gly Leu Gln
35 40 45

Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro
50 55 60

Asp Ile Lys Arg Gly Arg Phe Ser Phe Glu Glu Glu Glu Thr Ile Ile
65 70 75 80

Gln Leu His Ser Ile Met Gly Asn Lys Trp Ser Ala Ile Ala Ala Arg
85 90 95

Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Thr His
100 105 110

Ile Arg Lys Arg Leu Leu Lys Met Gly Ile Asp Pro Val Thr His Thr
115 120 125

Pro Arg Leu Asp Leu Leu Asp Ile Ser Ser Ile Leu Ser Ser Ser Ile
130 135 140

Tyr Asn Ser Ser His His His His His His His Gln Gln His Met Asn
145 150 155 160

Met Ser Arg Leu Met Met Ser Asp Gly Asn His Gln Pro Leu Val Asn
165 170 175

Pro Glu Ile Leu Lys Leu Ala Thr Ser Leu Phe Ser Asn Gln Asn His
180 185 190

Pro Asn Asn Thr His Glu Asn Asn Thr Val Asn Gln Thr Glu Val Asn
195 200 205

Gln Tyr Gln Thr Gly Tyr Asn Met Pro Gly Asn Glu Glu Leu Gln Ser
210 215 220

Trp Phe Pro Ile Met Asp Gln Phe Thr Asn Phe Gln Asp Leu Met Pro
225 230 235 240

Met Lys Thr Thr Val Gln Asn Ser Leu Ser Tyr Asp Asp Asp Cys Ser
245 250 255

Lys Ser Asn Phe Val Leu Glu Pro Tyr Tyr Ser Asp Phe Ala Ser Val
260 265 270

Leu Thr Thr Pro Ser Ser Ser Pro Thr Pro Leu Asn Ser Ser Ser Ser
275 280 285

Thr Tyr Ile Asn Ser Ser Thr Cys Ser Thr Glu Asp Glu Lys Glu Ser
290 295 300

Tyr Tyr Ser Asp Asn Ile Thr Asn Tyr Ser Phe Asp Val Asn Gly Phe
305 310 315 320

Leu Gln Phe Gln



<210> 27
<211> 2162
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (40)..(1809)
<223> G584
<400> 27

aaaaagtctt ctcttttata actacgtcag agaactgtt atg tct ccg acg aat 54
Met Ser Pro Thr Asn
1 5

gtt caa gta acc gat tac cat ctc aac caa tca aaa acg gat aca aca 102
Val Gln Val Thr Asp Tyr His Leu Asn Gln Ser Lys Thr Asp Thr Thr
10 15 20

aat ctc tgg tca acc gac gac gat gca tcg gta atg gaa gct ttc atc 150
Asn Leu Trp Ser Thr Asp Asp Asp Ala Ser Val Met Glu Ala Phe Ile
25 30 35

ggc ggc ggc tcc gat cat tct tct ctt ttt cct cca ctt cct cct cct 198
Gly Gly Gly Ser Asp His Ser Ser Leu Phe Pro Pro Leu Pro Pro Pro
40 45 50

cct ctt cct caa gtc aac gaa gat aat ctc cag caa cgt ctc caa gct 246
Pro Leu Pro Gln Val Asn Glu Asp Asn Leu Gln Gln Arg Leu Gln Ala
55 60 65

tta atc gaa gga gca aac gag aac tgg act tac gcc gtg ttc tgg caa 294
Leu Ile Glu Gly Ala Asn Glu Asn Trp Thr Tyr Ala Val Phe Trp Gln
70 75 80 85

tca tct cac ggt ttc gcc gga gaa gac aac aac aac aac aac aca gtg 342
Ser Ser His Gly Phe Ala Gly Glu Asp Asn Asn Asn Asn Asn Thr Val
90 95 100

ttg tta ggt tgg gga gat ggt tat tac aaa gga gaa gaa gag aag tct 390
Leu Leu Gly Trp Gly Asp Gly Tyr Tyr Lys Gly Glu Glu Glu Lys Ser
105 110 115

aga aag aag aaa tca aat cca gct agt gca gct gaa caa gag cat cgt 438
Arg Lys Lys Lys Ser Asn Pro Ala Ser Ala Ala Glu Gln Glu His Arg
120 125 130

aag aga gtg att aga gag ctc aac tct tta atc tcc ggt ggt gta gga 486
Lys Arg Val Ile Arg Glu Leu Asn Ser Leu Ile Ser Gly Gly Val Gly
135 140 145

gga gga gat gaa gct gga gat gaa gaa gtt aca gat act gaa tgg ttc 534
Gly Gly Asp Glu Ala Gly Asp Glu Glu Val Thr Asp Thr Glu Trp Phe
150 155 160 165

ttc tta gtt tca atg aca cag agc ttt gtc aag ggt act ggt tta cct 582
Phe Leu Val Ser Met Thr Gln Ser Phe Val Lys Gly Thr Gly Leu Pro
170 175 180

ggt caa gct ttc tca aat tca gac acg att tgg tta tct ggt tct aat 630
Gly Gln Ala Phe Ser Asn Ser Asp Thr Ile Trp Leu Ser Gly Ser Asn
185 190 195

gct tta gct gga tca agt tgt gag aga gct cgt caa ggt cag att tat 678
Ala Leu Ala Gly Ser Ser Cys Glu Arg Ala Arg Gln Gly Gln Ile Tyr
200 205 210

ggg tta caa aca atg gtg tgt gta gcg aca gag aat ggt gtc gtt gag 726
Gly Leu Gln Thr Met Val Cys Val Ala Thr Glu Asn Gly Val Val Glu
215 220 225

ctt ggt tcg tcg gag att att cat caa agt tca gat ctt gtt gat aaa 774
Leu Gly Ser Ser Glu Ile Ile His Gln Ser Ser Asp Leu Val Asp Lys
230 235 240 245

gtt gac acc ttt ttc aat ttt aac aat ggt ggt ggt gaa ttt ggt tct 822
Val Asp Thr Phe Phe Asn Phe Asn Asn Gly Gly Gly Glu Phe Gly Ser
250 255 260

tgg gcg ttt aat ttg aat cca gat caa gga gag aat gat cca ggt ttg 870
Trp Ala Phe Asn Leu Asn Pro Asp Gln Gly Glu Asn Asp Pro Gly Leu
265 270 275

tgg att agt gaa cct aat ggt gtt gac tct ggt ctt gta gct gct ccg 918
Trp Ile Ser Glu Pro Asn Gly Val Asp Ser Gly Leu Val Ala Ala Pro
280 285 290

gtg atg aat aat ggt gga aat gac tca act tct aat tct gat tct caa 966
Val Met Asn Asn Gly Gly Asn Asp Ser Thr Ser Asn Ser Asp Ser Gln
295 300 305

cca att tct aag ctt tgt aat gga agc tct gtt gaa aac cct aac cct 1014
Pro Ile Ser Lys Leu Cys Asn Gly Ser Ser Val Glu Asn Pro Asn Pro
310 315 320 325

aaa gtt ctg aaa tct tgt gaa atg gtg aat ttc aag aat ggg att gag 1062
Lys Val Leu Lys Ser Cys Glu Met Val Asn Phe Lys Asn Gly Ile Glu
330 335 340

aat ggt caa gaa gaa gat agt agt aat aag aag aga tca ccg gtt tcg 1110
Asn Gly Gln Glu Glu Asp Ser Ser Asn Lys Lys Arg Ser Pro Val Ser
345 350 355

aat aat gaa gaa ggg atg ctt tct ttt acc tct gtt ctt cca tgt gac 1158
Asn Asn Glu Glu Gly Met Leu Ser Phe Thr Ser Val Leu Pro Cys Asp
360 365 370

tcg aat cac tct gat ctt gaa gct tca gtg gct aaa gaa gct gag agt 1206
Ser Asn His Ser Asp Leu Glu Ala Ser Val Ala Lys Glu Ala Glu Ser
375 380 385

aac aga gtt gtg gtt gaa ccg gag aag aaa ccg agg aaa cga ggg aga 1254
Asn Arg Val Val Val Glu Pro Glu Lys Lys Pro Arg Lys Arg Gly Arg
390 395 400 405

aaa ccg gcg aat gga aga gaa gag cct ttg aat cat gta gag gca gag 1302
Lys Pro Ala Asn Gly Arg Glu Glu Pro Leu Asn His Val Glu Ala Glu
410 415 420

aga cag aga aga gag aag ttg aat cag aga ttc tat tct tta aga gct 1350
Arg Gln Arg Arg Glu Lys Leu Asn Gln Arg Phe Tyr Ser Leu Arg Ala
425 430 435

gtg gtt cct aat gtg tct aag atg gat aaa gct tct cta tta gga gat 1398
Val Val Pro Asn Val Ser Lys Met Asp Lys Ala Ser Leu Leu Gly Asp
440 445 450

gct att tcg tat atc agt gag ctt aag tct aag ttg caa aag gct gaa 1446
Ala Ile Ser Tyr Ile Ser Glu Leu Lys Ser Lys Leu Gln Lys Ala Glu
455 460 465

tct gat aaa gaa gag ttg cag aag cag att gat gtg atg aat aaa gaa 1494
Ser Asp Lys Glu Glu Leu Gln Lys Gln Ile Asp Val Met Asn Lys Glu
470 475 480 485

gcg gga aat gcg aaa agt tcg gta aaa gat cga aaa tgt ttg aat caa 1542
Ala Gly Asn Ala Lys Ser Ser Val Lys Asp Arg Lys Cys Leu Asn Gln
490 495 500

gaa tcg agt gtg ttg ata gag atg gag gtt gat gtg aag att att ggt 1590
Glu Ser Ser Val Leu Ile Glu Met Glu Val Asp Val Lys Ile Ile Gly
505 510 515

tgg gat gca atg ata agg att caa tgt agt aag agg aat cat cct ggt 1638
Trp Asp Ala Met Ile Arg Ile Gln Cys Ser Lys Arg Asn His Pro Gly
520 525 530

gct aag ttc atg gaa gca ctt aag gag ttg gat ttg gaa gtg aat cat 1686
Ala Lys Phe Met Glu Ala Leu Lys Glu Leu Asp Leu Glu Val Asn His
535 540 545

gcg agt tta tcg gta gtg aat gat ctt atg atc caa caa gcg act gtg 1734
Ala Ser Leu Ser Val Val Asn Asp Leu Met Ile Gln Gln Ala Thr Val
550 555 560 565

aaa atg ggg aat cag ttt ttc acg caa gat caa ctc aag gtt gct cta 1782
Lys Met Gly Asn Gln Phe Phe Thr Gln Asp Gln Leu Lys Val Ala Leu
570 575 580

acg gag aaa gtt gga gaa tgt cca tga attgaagtca gcatctttag 1829
Thr Glu Lys Val Gly Glu Cys Pro
585

ggctaataca ccggagaata ctgcgaaaag tcgaaaacaa cgatcatagt ataagccgcg 1889

gtaaaaagtg ttaaaccttt cacacaagtt tctctagtga atgtagttgt aaactctatt 1949

gtgtaagggt aattttgtag tacccacttg ttgctattga atgcttgtta gagaggattc 2009

ttagtgtagt atatgattag gttggggttt gttgtttcat gagataaata aatgtgtttg 2069

atcaatggtt aagtctttgg tttgttggtg tatgtatgta aataaggctt ttgttagaaa 2129

taagacaaat gggactgaag ttggagttta aaa 2162

<210> 28 
<211> 589 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 28 

Met Ser Pro Thr Asn Val Gln Val Thr Asp Tyr His Leu Asn Gln Ser
1 5 10 15

Lys Thr Asp Thr Thr Asn Leu Trp Ser Thr Asp Asp Asp Ala Ser Val
20 25 30

Met Glu Ala Phe Ile Gly Gly Gly Ser Asp His Ser Ser Leu Phe Pro
35 40 45

Pro Leu Pro Pro Pro Pro Leu Pro Gln Val Asn Glu Asp Asn Leu Gln
50 55 60

Gln Arg Leu Gln Ala Leu Ile Glu Gly Ala Asn Glu Asn Trp Thr Tyr
65 70 75 80

Ala Val Phe Trp Gln Ser Ser His Gly Phe Ala Gly Glu Asp Asn Asn
85 90 95

Asn Asn Asn Thr Val Leu Leu Gly Trp Gly Asp Gly Tyr Tyr Lys Gly
100 105 110

Glu Glu Glu Lys Ser Arg Lys Lys Lys Ser Asn Pro Ala Ser Ala Ala
115 120 125

Glu Gln Glu His Arg Lys Arg Val Ile Arg Glu Leu Asn Ser Leu Ile
130 135 140

Ser Gly Gly Val Gly Gly Gly Asp Glu Ala Gly Asp Glu Glu Val Thr
145 150 155 160

Asp Thr Glu Trp Phe Phe Leu Val Ser Met Thr Gln Ser Phe Val Lys
165 170 175

Gly Thr Gly Leu Pro Gly Gln Ala Phe Ser Asn Ser Asp Thr Ile Trp
180 185 190

Leu Ser Gly Ser Asn Ala Leu Ala Gly Ser Ser Cys Glu Arg Ala Arg
195 200 205

Gln Gly Gln Ile Tyr Gly Leu Gln Thr Met Val Cys Val Ala Thr Glu
210 215 220

Asn Gly Val Val Glu Leu Gly Ser Ser Glu Ile Ile His Gln Ser Ser
225 230 235 240

Asp Leu Val Asp Lys Val Asp Thr Phe Phe Asn Phe Asn Asn Gly Gly
245 250 255

Gly Glu Phe Gly Ser Trp Ala Phe Asn Leu Asn Pro Asp Gln Gly Glu
260 265 270

Asn Asp Pro Gly Leu Trp Ile Ser Glu Pro Asn Gly Val Asp Ser Gly
275 280 285

Leu Val Ala Ala Pro Val Met Asn Asn Gly Gly Asn Asp Ser Thr Ser
290 295 300

Asn Ser Asp Ser Gln Pro Ile Ser Lys Leu Cys Asn Gly Ser Ser Val
305 310 315 320

Glu Asn Pro Asn Pro Lys Val Leu Lys Ser Cys Glu Met Val Asn Phe
325 330 335

Lys Asn Gly Ile Glu Asn Gly Gln Glu Glu Asp Ser Ser Asn Lys Lys
340 345 350

Arg Ser Pro Val Ser Asn Asn Glu Glu Gly Met Leu Ser Phe Thr Ser
355 360 365

Val Leu Pro Cys Asp Ser Asn His Ser Asp Leu Glu Ala Ser Val Ala
370 375 380

Lys Glu Ala Glu Ser Asn Arg Val Val Val Glu Pro Glu Lys Lys Pro
385 390 395 400

Arg Lys Arg Gly Arg Lys Pro Ala Asn Gly Arg Glu Glu Pro Leu Asn
405 410 415

His Val Glu Ala Glu Arg Gln Arg Arg Glu Lys Leu Asn Gln Arg Phe
420 425 430

Tyr Ser Leu Arg Ala Val Val Pro Asn Val Ser Lys Met Asp Lys Ala
435 440 445

Ser Leu Leu Gly Asp Ala Ile Ser Tyr Ile Ser Glu Leu Lys Ser Lys
450 455 460

Leu Gln Lys Ala Glu Ser Asp Lys Glu Glu Leu Gln Lys Gln Ile Asp
465 470 475 480

Val Met Asn Lys Glu Ala Gly Asn Ala Lys Ser Ser Val Lys Asp Arg
485 490 495

Lys Cys Leu Asn Gln Glu Ser Ser Val Leu Ile Glu Met Glu Val Asp
500 505 510

Val Lys Ile Ile Gly Trp Asp Ala Met Ile Arg Ile Gln Cys Ser Lys
515 520 525

Arg Asn His Pro Gly Ala Lys Phe Met Glu Ala Leu Lys Glu Leu Asp
530 535 540

Leu Glu Val Asn His Ala Ser Leu Ser Val Val Asn Asp Leu Met Ile
545 550 555 560

Gln Gln Ala Thr Val Lys Met Gly Asn Gln Phe Phe Thr Gln Asp Gln
565 570 575

Leu Lys Val Ala Leu Thr Glu Lys Val Gly Glu Cys Pro
580 585



<210> 29
<211> 1056
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (1)..(1056)
<223> G668
<400> 29



atg gga aga cca cct tgc tgt gaa aag att gga gtg aag aaa ggg cca 48
Met Gly Arg Pro Pro Cys Cys Glu Lys Ile Gly Val Lys Lys Gly Pro
1 5 10 15

tgg aca cca gag gaa gac atc atc ttg gtt tct tac atc caa gaa cat 96
Trp Thr Pro Glu Glu Asp Ile Ile Leu Val Ser Tyr Ile Gln Glu His
20 25 30

ggt cct gga aac tgg aga tct gtc cca aca cac aca ggt tta aga tgt 144
Gly Pro Gly Asn Trp Arg Ser Val Pro Thr His Thr Gly Leu Arg Cys
35 40 45

agc aag agc tgc aga ttg aga tgg act aat tat ctt cga ccc ggt att 192
Ser Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro Gly Ile
50 55 60

aag cgt gga aat ttt act gag cat gaa gag aag aca att gtt cat ctt 240
Lys Arg Gly Asn Phe Thr Glu His Glu Glu Lys Thr Ile Val His Leu
65 70 75 80

caa gcc ctt tta ggc aac aga tgg gca gcc ata gca tca tac ctt cca 288
Gln Ala Leu Leu Gly Asn Arg Trp Ala Ala Ile Ala Ser Tyr Leu Pro
85 90 95

gaa agg aca gac aat gat ata aag aac tat tgg aac act cac ttg aag 336
Glu Arg Thr Asp Asn Asp Ile Lys Asn Tyr Trp Asn Thr His Leu Lys
100 105 110

aag aag ctc aaa aag att aat gaa tct ggt gaa gaa gat aat gat ggt 384
Lys Lys Leu Lys Lys Ile Asn Glu Ser Gly Glu Glu Asp Asn Asp Gly
115 120 125

gtc tct tca tca aac act agt tca caa aag aac cat caa agc act aac 432
Val Ser Ser Ser Asn Thr Ser Ser Gln Lys Asn His Gln Ser Thr Asn
130 135 140

aaa ggt caa tgg gaa aga aga ctt cag aca gac att aac atg gca aaa 480
Lys Gly Gln Trp Glu Arg Arg Leu Gln Thr Asp Ile Asn Met Ala Lys
145 150 155 160

caa gct ctt tgt gag gcc ttg tct tta gac aaa cca tca tcc act ctt 528
Gln Ala Leu Cys Glu Ala Leu Ser Leu Asp Lys Pro Ser Ser Thr Leu
165 170 175

tca tca tct tca tca tta ccg aca cca gta atc aca caa caa aac atc 576
Ser Ser Ser Ser Ser Leu Pro Thr Pro Val Ile Thr Gln Gln Asn Ile
180 185 190

cgt aac ttc tca tca gct ttg ctt gac cgt tgt tat gat cca tcc tct 624
Arg Asn Phe Ser Ser Ala Leu Leu Asp Arg Cys Tyr Asp Pro Ser Ser
195 200 205

tct tct tca tct acc aca acc acc act aca agc aac act act aat cca 672
Ser Ser Ser Ser Thr Thr Thr Thr Thr Thr Ser Asn Thr Thr Asn Pro
210 215 220

tac cca tca ggg gta tat gcg tca agt gct gag aac atc gcc cgg ttg 720
Tyr Pro Ser Gly Val Tyr Ala Ser Ser Ala Glu Asn Ile Ala Arg Leu
225 230 235 240

ctt caa gat ttc atg aaa gac aca ccc aag gct tta act tta tca tct 768
Leu Gln Asp Phe Met Lys Asp Thr Pro Lys Ala Leu Thr Leu Ser Ser
245 250 255

tca tct ccg gtt tca gag act gga cca ctc act gct gca gtc tcg gaa 816
Ser Ser Pro Val Ser Glu Thr Gly Pro Leu Thr Ala Ala Val Ser Glu
260 265 270

gaa ggt gga gaa ggg ttt gaa caa tct ttc ttc agc ttc aat tca atg 864
Glu Gly Gly Glu Gly Phe Glu Gln Ser Phe Phe Ser Phe Asn Ser Met
275 280 285

gac gaa act caa aac ttg act cag gag aca agc ttc ttc cat gat caa 912
Asp Glu Thr Gln Asn Leu Thr Gln Glu Thr Ser Phe Phe His Asp Gln
290 295 300

gtg atc aaa ccg gaa ata aca atg gac caa gat cat ggt cta ata tca 960
Val Ile Lys Pro Glu Ile Thr Met Asp Gln Asp His Gly Leu Ile Ser
305 310 315 320

caa ggg tct ctg tct ttg ttt gag aaa tgg tta ttt gat gag caa agc 1008
Gln Gly Ser Leu Ser Leu Phe Glu Lys Trp Leu Phe Asp Glu Gln Ser
325 330 335

cac gag atg gtt ggt atg gca cta gca gga caa gaa ggg atg ttc tag 1056
His Glu Met Val Gly Met Ala Leu Ala Gly Gln Glu Gly Met Phe
340 345 350

<210> 30 
<211> 351 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 30 

Met Gly Arg Pro Pro Cys Cys Glu Lys Ile Gly Val Lys Lys Gly Pro
1 5 10 15

Trp Thr Pro Glu Glu Asp Ile Ile Leu Val Ser Tyr Ile Gln Glu His
20 25 30

Gly Pro Gly Asn Trp Arg Ser Val Pro Thr His Thr Gly Leu Arg Cys
35 40 45

Ser Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro Gly Ile
50 55 60

Lys Arg Gly Asn Phe Thr Glu His Glu Glu Lys Thr Ile Val His Leu
65 70 75 80

Gln Ala Leu Leu Gly Asn Arg Trp Ala Ala Ile Ala Ser Tyr Leu Pro
85 90 95

Glu Arg Thr Asp Asn Asp Ile Lys Asn Tyr Trp Asn Thr His Leu Lys
100 105 110

Lys Lys Leu Lys Lys Ile Asn Glu Ser Gly Glu Glu Asp Asn Asp Gly
115 120 125

Val Ser Ser Ser Asn Thr Ser Ser Gln Lys Asn His Gln Ser Thr Asn
130 135 140

Lys Gly Gln Trp Glu Arg Arg Leu Gln Thr Asp Ile Asn Met Ala Lys
145 150 155 160

Gln Ala Leu Cys Glu Ala Leu Ser Leu Asp Lys Pro Ser Ser Thr Leu
165 170 175

Ser Ser Ser Ser Ser Leu Pro Thr Pro Val Ile Thr Gln Gln Asn Ile
180 185 190

Arg Asn Phe Ser Ser Ala Leu Leu Asp Arg Cys Tyr Asp Pro Ser Ser
195 200 205

Ser Ser Ser Ser Thr Thr Thr Thr Thr Thr Ser Asn Thr Thr Asn Pro
210 215 220

Tyr Pro Ser Gly Val Tyr Ala Ser Ser Ala Glu Asn Ile Ala Arg Leu
225 230 235 240

Leu Gln Asp Phe Met Lys Asp Thr Pro Lys Ala Leu Thr Leu Ser Ser
245 250 255

Ser Ser Pro Val Ser Glu Thr Gly Pro Leu Thr Ala Ala Val Ser Glu
260 265 270

Glu Gly Gly Glu Gly Phe Glu Gln Ser Phe Phe Ser Phe Asn Ser Met
275 280 285

Asp Glu Thr Gln Asn Leu Thr Gln Glu Thr Ser Phe Phe His Asp Gln
290 295 300

Val Ile Lys Pro Glu Ile Thr Met Asp Gln Asp His Gly Leu Ile Ser
305 310 315 320

Gln Gly Ser Leu Ser Leu Phe Glu Lys Trp Leu Phe Asp Glu Gln Ser
325 330 335

His Glu Met Val Gly Met Ala Leu Ala Gly Gln Glu Gly Met Phe
340 345 350



<210> 31
<211> 2526
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (338)..(2275)
<223> G680



WO 01/35727 



PCT/US00/31457 



MBI-17 Sequence Listing . ST25 

<400> 31 

cagttatctt cttccttctt ctctctgttt tttaaattta tttttagaga attttttttg 60

ttttgcttcc gatttgatta tttccgggaa cgatgacttc tccggggagt tcccggtgag 120 

atgataagtc agattgcata cttgtctcct ccatggctac tctcaagggt tttggctgcg 180 

gtggattcgt ttggtttctc tagaatctaa agaggttatc acaacggctt tgcaatttga 240 

aaactttcat gtttggggag atcaaagatg gtttcttttt tatactttac ttgttagaga 300 

ggatttgaag cagcgaatag ctgcaaccgg tcctgtt atg gat act aat aca tct 355 

Met Asp Thr Asn Thr Ser 
1 5 

gga gaa gaa tta tta gct aag gca aga aag cca tat aca ata aca aag 403
Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys Pro Tyr Thr Ile Thr Lys
10 15 20

cag cga gag cga tgg act gag gat gag cat gag agg ttt cta gaa gcc 451
Gln Arg Glu Arg Trp Thr Glu Asp Glu His Glu Arg Phe Leu Glu Ala
25 30 35

ttg agg ctt tat gga aga gct tgg caa cga att gaa gaa cat att ggg 499
Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg Ile Glu Glu His Ile Gly
40 45 50

aca aag act gct gtt cag atc aga agt cat gca caa aag ttc ttc aca 547
Thr Lys Thr Ala Val Gln Ile Arg Ser His Ala Gln Lys Phe Phe Thr
55 60 65 70

aag ttg gag aaa gag gct gaa gtt aaa ggc atc cct gtt tgc caa gct 595
Lys Leu Glu Lys Glu Ala Glu Val Lys Gly Ile Pro Val Cys Gln Ala
75 80 85

ttg gac ata gaa att ccg cct cct cgt cct aaa cga aaa ccc aat act 643
Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro Lys Arg Lys Pro Asn Thr
90 95 100

cct tat cct cga aaa cct ggg aac aac ggt aca tct tcc tct caa gta 691
Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly Thr Ser Ser Ser Gln Val
105 110 115

tca tca gca aaa gat gca aaa ctt gtt tca tcg gcc tct tct tca cag 739
Ser Ser Ala Lys Asp Ala Lys Leu Val Ser Ser Ala Ser Ser Ser Gln
120 125 130

ttg aat cag gcg ttc ttg gat ttg gaa aaa atg ccg ttc tct gag aaa 787
Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys Met Pro Phe Ser Glu Lys
135 140 145 150

aca tca act gga aaa gaa aat caa gat gag aat tgc tcg ggt gtt tct 835
Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu Asn Cys Ser Gly Val Ser
155 160 165

act gtg aac aag tat ccc tta cca acg aaa cag gta agt ggc gac att 883
Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys Gln Val Ser Gly Asp Ile
170 175 180

gaa aca agt aag acc tca act gtg gac aac gcg gtt caa gat gtt ccc 931
Glu Thr Ser Lys Thr Ser Thr Val Asp Asn Ala Val Gln Asp Val Pro
185 190 195

aag aag aac aaa gac aaa gat ggt aac gat ggt act act gtg cac agc 979
Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp Gly Thr Thr Val His Ser
200 205 210

atg caa aac tac cct tgg cat ttc cac gca gat att gtg aac ggg aat 1027
Met Gln Asn Tyr Pro Trp His Phe His Ala Asp Ile Val Asn Gly Asn
215 220 225 230

ata gca aaa tgc cct caa aat cat ccc tca ggt atg gta tct caa gac 1075
Ile Ala Lys Cys Pro Gln Asn His Pro Ser Gly Met Val Ser Gln Asp
235 240 245

ttc atg ttt cat cct atg aga gaa gaa act cac ggg cac gca aat ctt 1123
Phe Met Phe His Pro Met Arg Glu Glu Thr His Gly His Ala Asn Leu
250 255 260

caa gct aca aca gca tct gct act act aca gct tct cat caa gcg ttt 1171
Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr Ala Ser His Gln Ala Phe
265 270 275

cca gct tgt cat tca cag gat gat tac cgt tcg ttt ctc cag ata tca 1219
Pro Ala Cys His Ser Gln Asp Asp Tyr Arg Ser Phe Leu Gln Ile Ser
280 285 290

tct act ttc tcc aat ctt att atg tca act ctc cta cag aat cct gca 1267
Ser Thr Phe Ser Asn Leu Ile Met Ser Thr Leu Leu Gln Asn Pro Ala
295 300 305 310

gct cat gct gca gct aca ttc gct gct tcg gtc tgg cct tat gcg agt 1315
Ala His Ala Ala Ala Thr Phe Ala Ala Ser Val Trp Pro Tyr Ala Ser
315 320 325

gtc ggg aat tct ggt gat tca tca acc cca atg agc tct tct cct cca 1363
Val Gly Asn Ser Gly Asp Ser Ser Thr Pro Met Ser Ser Ser Pro Pro
330 335 340

agt ata act gcc att gcc gct gct aca gta gct gct gca act gct tgg 1411
Ser Ile Thr Ala Ile Ala Ala Ala Thr Val Ala Ala Ala Thr Ala Trp
345 350 355

tgg gct tct cat gga ctt ctt cct gta tgc gct cca gct cca ata aca 1459
Trp Ala Ser His Gly Leu Leu Pro Val Cys Ala Pro Ala Pro Ile Thr
360 365 370

tgt gtt cca ttc tca act gtt gca gtt cca act cca gca atg act gaa 1507
Cys Val Pro Phe Ser Thr Val Ala Val Pro Thr Pro Ala Met Thr Glu
375 380 385 390

atg gat acc gtt gaa aat act caa ccg ttt gag aaa caa aac aca gct 1555
Met Asp Thr Val Glu Asn Thr Gln Pro Phe Glu Lys Gln Asn Thr Ala
395 400 405

ctg caa gat caa acc ttg gct tcg aaa tct cca gct tca tca tct gat 1603
Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser Pro Ala Ser Ser Ser Asp
410 415 420

gat tca gat gag act gga gta acc aag cta aat gcc gac tca aaa acc 1651
Asp Ser Asp Glu Thr Gly Val Thr Lys Leu Asn Ala Asp Ser Lys Thr
425 430 435

aat gat gat aaa att gag gag gtt gtt gtt act gcc gct gtg cat gac 1699
Asn Asp Asp Lys Ile Glu Glu Val Val Val Thr Ala Ala Val His Asp
440 445 450

tca aac act gcc cag aag aaa aat ctt gtg gac cgc tca tcg tgt ggc 1747
Ser Asn Thr Ala Gln Lys Lys Asn Leu Val Asp Arg Ser Ser Cys Gly
455 460 465 470

tca aat aca cct tca ggg agt gac gca gaa act gat gca tta gat aaa 1795
Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu Thr Asp Ala Leu Asp Lys
475 480 485

atg gag aaa gat aaa gag gat gtg aag gag aca gat gag aat cag cca 1843
Met Glu Lys Asp Lys Glu Asp Val Lys Glu Thr Asp Glu Asn Gln Pro
490 495 500

gat gtt att gag tta aat aac cgt aag att aaa atg aga gac aac aac 1891
Asp Val Ile Glu Leu Asn Asn Arg Lys Ile Lys Met Arg Asp Asn Asn
505 510 515

agc aac aac aat gca act act gat tcg tgg aag gaa gtc tcc gaa gag 1939
Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp Lys Glu Val Ser Glu Glu
520 525 530

ggt cgt ata gcg ttt cag gct ctc ttt gca aga gaa aga ttg cct caa 1987
Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala Arg Glu Arg Leu Pro Gln
535 540 545 550

agc ttt tcg cct cct caa gtg gca gag aat gtg aat aga aaa caa agt 2035
Ser Phe Ser Pro Pro Gln Val Ala Glu Asn Val Asn Arg Lys Gln Ser
555 560 565

gac acg tca atg cca ttg gct cct aat ttc aaa agc cag gat tct tgt 2083
Asp Thr Ser Met Pro Leu Ala Pro Asn Phe Lys Ser Gln Asp Ser Cys
570 575 580

gct gca gac caa gaa gga gta gta atg atc ggt gtt gga aca tgc aag 2131
Ala Ala Asp Gln Glu Gly Val Val Met Ile Gly Val Gly Thr Cys Lys
585 590 595

agt ctt aaa acg aga cag aca gga ttt aag cca tac aag aga tgt tca 2179
Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser
600 605 610

atg gaa gtg aaa gag agc caa gtt ggg aac ata aac aat caa agt gat 2227
Met Glu Val Lys Glu Ser Gln Val Gly Asn Ile Asn Asn Gln Ser Asp
615 620 625 630

gaa aaa gtc tgc aaa agg ctt cga ttg gaa gga gaa gct tct aca tga 2275
Glu Lys Val Cys Lys Arg Leu Arg Leu Glu Gly Glu Ala Ser Thr
635 640 645

cagacttgga ggtaaaaaaa aaacatccac atttttatca atatctttaa atctagtgtt 2335

agtagtttgc ttctccaatc tttatgaaag agacttttaa ttttccttcc gaacatttct 2395

ttggtcatgt caggttctgt accatattac cccatgtctt gtctcttgtc tctgtttgtg 2455

tatgctactt gtggtctata tgtcatctgc tactactgtt aattaaccat taagcaatgg 2515

atttgtcttt a 2526

<210> 32 

<211> 645 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 32 

Met Asp Thr Asn Thr Ser Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys
1 5 10 15

Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Asp Glu His
20 25 30

Glu Arg Phe Leu Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg
35 40 45

Ile Glu Glu His Ile Gly Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

Ala Gln Lys Phe Phe Thr Lys Leu Glu Lys Glu Ala Glu Val Lys Gly
65 70 75 80

Ile Pro Val Cys Gln Ala Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro
85 90 95

Lys Arg Lys Pro Asn Thr Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly
100 105 110

Thr Ser Ser Ser Gln Val Ser Ser Ala Lys Asp Ala Lys Leu Val Ser
115 120 125

Ser Ala Ser Ser Ser Gln Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys
130 135 140

Met Pro Phe Ser Glu Lys Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu
145 150 155 160

Asn Cys Ser Gly Val Ser Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys
165 170 175

Gln Val Ser Gly Asp Ile Glu Thr Ser Lys Thr Ser Thr Val Asp Asn
180 185 190

Ala Val Gln Asp Val Pro Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp
195 200 205

Gly Thr Thr Val His Ser Met Gln Asn Tyr Pro Trp His Phe His Ala
210 215 220

Asp Ile Val Asn Gly Asn Ile Ala Lys Cys Pro Gln Asn His Pro Ser
225 230 235 240

Gly Met Val Ser Gln Asp Phe Met Phe His Pro Met Arg Glu Glu Thr
245 250 255

His Gly His Ala Asn Leu Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr
260 265 270

Ala Ser His Gln Ala Phe Pro Ala Cys His Ser Gln Asp Asp Tyr Arg
275 280 285

Ser Phe Leu Gln Ile Ser Ser Thr Phe Ser Asn Leu Ile Met Ser Thr
290 295 300

Leu Leu Gln Asn Pro Ala Ala His Ala Ala Ala Thr Phe Ala Ala Ser
305 310 315 320

Val Trp Pro Tyr Ala Ser Val Gly Asn Ser Gly Asp Ser Ser Thr Pro
325 330 335

Met Ser Ser Ser Pro Pro Ser Ile Thr Ala Ile Ala Ala Ala Thr Val
340 345 350

Ala Ala Ala Thr Ala Trp Trp Ala Ser His Gly Leu Leu Pro Val Cys
355 360 365

Ala Pro Ala Pro Ile Thr Cys Val Pro Phe Ser Thr Val Ala Val Pro
370 375 380

Thr Pro Ala Met Thr Glu Met Asp Thr Val Glu Asn Thr Gln Pro Phe
385 390 395 400

Glu Lys Gln Asn Thr Ala Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser
405 410 415

Pro Ala Ser Ser Ser Asp Asp Ser Asp Glu Thr Gly Val Thr Lys Leu
420 425 430

Asn Ala Asp Ser Lys Thr Asn Asp Asp Lys Ile Glu Glu Val Val Val
435 440 445

Thr Ala Ala Val His Asp Ser Asn Thr Ala Gln Lys Lys Asn Leu Val
450 455 460

Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu
465 470 475 480

Thr Asp Ala Leu Asp Lys Met Glu Lys Asp Lys Glu Asp Val Lys Glu
485 490 495

Thr Asp Glu Asn Gln Pro Asp Val Ile Glu Leu Asn Asn Arg Lys Ile
500 505 510

Lys Met Arg Asp Asn Asn Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp
515 520 525

Lys Glu Val Ser Glu Glu Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala
530 535 540

Arg Glu Arg Leu Pro Gln Ser Phe Ser Pro Pro Gln Val Ala Glu Asn
545 550 555 560

Val Asn Arg Lys Gln Ser Asp Thr Ser Met Pro Leu Ala Pro Asn Phe
565 570 575

Lys Ser Gln Asp Ser Cys Ala Ala Asp Gln Glu Gly Val Val Met Ile
580 585 590

Gly Val Gly Thr Cys Lys Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys
595 600 605

Pro Tyr Lys Arg Cys Ser Met Glu Val Lys Glu Ser Gln Val Gly Asn
610 615 620

Ile Asn Asn Gln Ser Asp Glu Lys Val Cys Lys Arg Leu Arg Leu Glu
625 630 635 640

Gly Glu Ala Ser Thr
645

<210> 33 
<211> 228 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1) . . (228) 

<223> G682 

<400> 33 

atg gat aac cat cgc agg act aag caa ccc aag acc aac tcc atc gtt 48
Met Asp Asn His Arg Arg Thr Lys Gln Pro Lys Thr Asn Ser Ile Val
1 5 10 15

act tct tct tct gaa gaa gtg agt agt ctt gag tgg gaa gtt gtg aac 96
Thr Ser Ser Ser Glu Glu Val Ser Ser Leu Glu Trp Glu Val Val Asn
20 25 30

atg agt caa gaa gaa gaa gat ttg gtc tct cga atg cat aag ctt gtc 144
Met Ser Gln Glu Glu Glu Asp Leu Val Ser Arg Met His Lys Leu Val
35 40 45

ggt gac agg tgg gaa ctg ata gct ggg agg atc cca gga aga acc gct 192
Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
50 55 60

gga gaa att gag agg ttt tgg gtc atg aaa aat tga 228
Gly Glu Ile Glu Arg Phe Trp Val Met Lys Asn
65 70 75

<210> 34 
<211> 75 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 34 

Met Asp Asn His Arg Arg Thr Lys Gln Pro Lys Thr Asn Ser Ile Val
1 5 10 15

Thr Ser Ser Ser Glu Glu Val Ser Ser Leu Glu Trp Glu Val Val Asn
20 25 30

Met Ser Gln Glu Glu Glu Asp Leu Val Ser Arg Met His Lys Leu Val
35 40 45

Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
50 55 60

Gly Glu Ile Glu Arg Phe Trp Val Met Lys Asn
65 70 75

<210> 35 

<211> 584 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (157).. (441) 

<223> G225 

<400> 35 

ctctctctct cactcttttc ttttccgaga acccaacaaa aaaaaagcta ctattaatcc 60 

ttcccctcgt gaggaaatca tttcttcttg tttctcgaga tttattctct ttctctctct 120 

ctttctctgt gtgtttcgtg tcttcagatt agttcg atg ttt cgt tca gac aag 174
Met Phe Arg Ser Asp Lys
1 5

gcg gaa aaa atg gat aaa cga cga cgg aga cag agc aaa gcc aag gct 222
Ala Glu Lys Met Asp Lys Arg Arg Arg Arg Gln Ser Lys Ala Lys Ala
10 15 20

tct tgt tcc gaa gag gtg agt agt atc gaa tgg gaa gct gtg aag atg 270
Ser Cys Ser Glu Glu Val Ser Ser Ile Glu Trp Glu Ala Val Lys Met
25 30 35

tca gaa gaa gaa gaa gat ctc att tct cgg atg tat aaa ctc gtt ggc 318
Ser Glu Glu Glu Glu Asp Leu Ile Ser Arg Met Tyr Lys Leu Val Gly
40 45 50

gac agg tgg gag ttg atc gcc gga agg atc ccg gga cgg acg ccg gag 366
Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Pro Glu
55 60 65 70

gag ata gag aga tat tgg ctt atg aaa cac ggc gtc gtt ttt gcc aac 414
Glu Ile Glu Arg Tyr Trp Leu Met Lys His Gly Val Val Phe Ala Asn
75 80 85

aga cga aga gac ttt ttt agg aaa tga ttttttttgt ttggattaaa 461
Arg Arg Arg Asp Phe Phe Arg Lys
90

agaaaatttt cctctcctta attcacaaga caagaaaaaa aggaaatgta cctgtccttg 521 

aattactatt ttggaatgta taattatcta tatatataag aagaaaaaat tgcttaggaa 581 

ttt 584 



<210> 36
<211> 94
<212> PRT
<213> Arabidopsis thaliana
<400> 36

Met Phe Arg Ser Asp Lys Ala Glu Lys Met Asp Lys Arg Arg Arg Arg
1 5 10 15

Gln Ser Lys Ala Lys Ala Ser Cys Ser Glu Glu Val Ser Ser Ile Glu
20 25 30

Trp Glu Ala Val Lys Met Ser Glu Glu Glu Glu Asp Leu Ile Ser Arg
35 40 45

Met Tyr Lys Leu Val Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile
50 55 60

Pro Gly Arg Thr Pro Glu Glu Ile Glu Arg Tyr Trp Leu Met Lys His
65 70 75 80

Gly Val Val Phe Ala Asn Arg Arg Arg Asp Phe Phe Arg Lys
85 90

<210> 37
<211> 1369
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (104)..(1174)
<223> G678
<400> 37

ggttgtgtcg tcactttact tctctttaaa taatctcttt ctctagtgat tgactttagg 60 

aacaagtgaa gtgagtgatt tttaataatc gccggccgag aga atg gga agg gcg 115 

Met Gly Arg Ala 
1 

ccg tgc tgt gag aag gta gga att aag aag ggg cgc tgg acg gcg gag 163
Pro Cys Cys Glu Lys Val Gly Ile Lys Lys Gly Arg Trp Thr Ala Glu
5 10 15 20

gaa gac cgg act ctc tcc gac tac att cag tcc aac ggc gaa gga tca 211
Glu Asp Arg Thr Leu Ser Asp Tyr Ile Gln Ser Asn Gly Glu Gly Ser
25 30 35

tgg cgt tct ctt ccc aaa aat gcc ggg cta aag aga tgt gga aag agc 259
Trp Arg Ser Leu Pro Lys Asn Ala Gly Leu Lys Arg Cys Gly Lys Ser
40 45 50

tgt aga ttg aga tgg ata aac tat ttg aga tca gac atc aag aga gga 307
Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Ser Asp Ile Lys Arg Gly
55 60 65

aac ata act ccc gaa gaa gag gac gtc att gtt aaa ctg cat tcc act 355
Asn Ile Thr Pro Glu Glu Glu Asp Val Ile Val Lys Leu His Ser Thr
70 75 80

ttg gga acc agg tgg tca aca att gcg agc aat cta ccg gga aga aca 403
Leu Gly Thr Arg Trp Ser Thr Ile Ala Ser Asn Leu Pro Gly Arg Thr
85 90 95 100

gac aac gaa ata aaa aac tat tgg aat tct cat ctc agc cgt aaa ctc 451
Asp Asn Glu Ile Lys Asn Tyr Trp Asn Ser His Leu Ser Arg Lys Leu
105 110 115

cac ggt tac ttc aga aaa cca act gtc gcc aat acc gtc gag aat gcg 499
His Gly Tyr Phe Arg Lys Pro Thr Val Ala Asn Thr Val Glu Asn Ala
120 125 130

cct ccg cct cct aag cgt aga cct gga aga acc agc aga tcc gcc atg 547
Pro Pro Pro Pro Lys Arg Arg Pro Gly Arg Thr Ser Arg Ser Ala Met
135 140 145

aaa ccc aaa ttt atc cta aac cct aaa aac cac aaa acc cct aat tct 595
Lys Pro Lys Phe Ile Leu Asn Pro Lys Asn His Lys Thr Pro Asn Ser
150 155 160

ttt aaa gca aac aaa agt gac atc gtt ttg cca act acg aca ata gag 643
Phe Lys Ala Asn Lys Ser Asp Ile Val Leu Pro Thr Thr Thr Ile Glu
165 170 175 180

aat gga gag gga gac aaa gaa gac gca tta atg gtg ttg tca agt agt 691
Asn Gly Glu Gly Asp Lys Glu Asp Ala Leu Met Val Leu Ser Ser Ser
185 190 195

agc tta agt gga gca gag gaa ccc ggt tta gga cca tgt ggt tat gga 739
Ser Leu Ser Gly Ala Glu Glu Pro Gly Leu Gly Pro Cys Gly Tyr Gly
200 205 210

gac gat ggc gat tgt aac cca agc att aat ggc gac gat gga gct ttg 787
Asp Asp Gly Asp Cys Asn Pro Ser Ile Asn Gly Asp Asp Gly Ala Leu
215 220 225

tgt ctc aat gac gac att ttc gat tct tgt ttt cta ttg gac gac tct 835
Cys Leu Asn Asp Asp Ile Phe Asp Ser Cys Phe Leu Leu Asp Asp Ser
230 235 240

cat gct gtc cac gtg tcc tca tgt gag tcg aac aac gta aaa aac tct 883
His Ala Val His Val Ser Ser Cys Glu Ser Asn Asn Val Lys Asn Ser
245 250 255 260

gag cca tat gga ggg atg tca gtt ggg cac aaa aat atc gaa acg atg 931
Glu Pro Tyr Gly Gly Met Ser Val Gly His Lys Asn Ile Glu Thr Met
265 270 275

gct gat gat ttc gtt gac tgg gac ttt gta tgg aga gaa ggt caa acc 979
Ala Asp Asp Phe Val Asp Trp Asp Phe Val Trp Arg Glu Gly Gln Thr
280 285 290

ctt tgg gac gaa aaa gag gat ctt gat tcg gtt ttg tcg agg ctg tta 1027
Leu Trp Asp Glu Lys Glu Asp Leu Asp Ser Val Leu Ser Arg Leu Leu
295 300 305

gat gga gag gaa atg gaa tct gag atc aga caa agg gac tcc aac gac 1075
Asp Gly Glu Glu Met Glu Ser Glu Ile Arg Gln Arg Asp Ser Asn Asp
310 315 320

ttt gga gaa ccg ttg gat att gac gaa gaa aac aag atg gct gct tgg 1123
Phe Gly Glu Pro Leu Asp Ile Asp Glu Glu Asn Lys Met Ala Ala Trp
325 330 335 340

ctt ttt tcc tta aaa att tta ccc cct tcc ttt tcc ctt ttc ccc ctt 1171
Leu Phe Ser Leu Lys Ile Leu Pro Pro Ser Phe Ser Leu Phe Pro Leu
345 350 355

taa tttttaccaa aaccccccct tgccagatcc tgtccgtttt tccattaaac 1224 

ctttttctcc ccctaccttc ctttttttat ttttaatttt ttttttttcc tttttttttc 1284 

ctttcctttt ttaattccga tttttggcgg gttgccaatt aaccaaatta aatccatcct 1344

taaaaaaaaa aaaaaaaaaa aaaaa 1369 

<210> 38 

<211> 356

<212> PRT 

<213> Arabidopsis thaliana 

<400> 38 

Met Gly Arg Ala Pro Cys Cys Glu Lys Val Gly Ile Lys Lys Gly Arg
1 5 10 15

Trp Thr Ala Glu Glu Asp Arg Thr Leu Ser Asp Tyr Ile Gln Ser Asn
20 25 30

Gly Glu Gly Ser Trp Arg Ser Leu Pro Lys Asn Ala Gly Leu Lys Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Ser Asp
50 55 60

Ile Lys Arg Gly Asn Ile Thr Pro Glu Glu Glu Asp Val Ile Val Lys
65 70 75 80

Leu His Ser Thr Leu Gly Thr Arg Trp Ser Thr Ile Ala Ser Asn Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Ser His Leu
100 105 110

Ser Arg Lys Leu His Gly Tyr Phe Arg Lys Pro Thr Val Ala Asn Thr
115 120 125

Val Glu Asn Ala Pro Pro Pro Pro Lys Arg Arg Pro Gly Arg Thr Ser
130 135 140

Arg Ser Ala Met Lys Pro Lys Phe Ile Leu Asn Pro Lys Asn His Lys
145 150 155 160

Thr Pro Asn Ser Phe Lys Ala Asn Lys Ser Asp Ile Val Leu Pro Thr
165 170 175

Thr Thr Ile Glu Asn Gly Glu Gly Asp Lys Glu Asp Ala Leu Met Val
180 185 190

Leu Ser Ser Ser Ser Leu Ser Gly Ala Glu Glu Pro Gly Leu Gly Pro
195 200 205

Cys Gly Tyr Gly Asp Asp Gly Asp Cys Asn Pro Ser Ile Asn Gly Asp
210 215 220

Asp Gly Ala Leu Cys Leu Asn Asp Asp Ile Phe Asp Ser Cys Phe Leu
225 230 235 240

Leu Asp Asp Ser His Ala Val His Val Ser Ser Cys Glu Ser Asn Asn
245 250 255

Val Lys Asn Ser Glu Pro Tyr Gly Gly Met Ser Val Gly His Lys Asn
260 265 270

Ile Glu Thr Met Ala Asp Asp Phe Val Asp Trp Asp Phe Val Trp Arg
275 280 285

Glu Gly Gln Thr Leu Trp Asp Glu Lys Glu Asp Leu Asp Ser Val Leu
290 295 300

Ser Arg Leu Leu Asp Gly Glu Glu Met Glu Ser Glu Ile Arg Gln Arg
305 310 315 320

Asp Ser Asn Asp Phe Gly Glu Pro Leu Asp Ile Asp Glu Glu Asn Lys
325 330 335

Met Ala Ala Trp Leu Phe Ser Leu Lys Ile Leu Pro Pro Ser Phe Ser
340 345 350

Leu Phe Pro Leu
355


<210> 39
<211> 1046
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (46)..(867)
<223> G233
<400> 39

gaaaaacatt tcaacttctt ttatcagcaa tcacaaatca aagag atg gga aga gct 57
Met Gly Arg Ala
1

cca tgc tgt gag aag atg ggg ttg aag aga gga cca tgg aca cct gaa 105
Pro Cys Cys Glu Lys Met Gly Leu Lys Arg Gly Pro Trp Thr Pro Glu
5 10 15 20

gaa gat caa atc ttg gtc tct ttt atc ctc aac cat gga cat agt aac 153
Glu Asp Gln Ile Leu Val Ser Phe Ile Leu Asn His Gly His Ser Asn
25 30 35

tgg cga gcc ctc cct aag caa gct ggt ctt ttg aga tgt gga aaa agc 201
Trp Arg Ala Leu Pro Lys Gln Ala Gly Leu Leu Arg Cys Gly Lys Ser
40 45 50

tgt aga ctt agg tgg atg aac tat tta aag cct gat att aaa cgt ggc 249
Cys Arg Leu Arg Trp Met Asn Tyr Leu Lys Pro Asp Ile Lys Arg Gly
55 60 65

aat ttc acc aaa gaa gag gaa gat gct atc atc agc tta cac caa ata 297
Asn Phe Thr Lys Glu Glu Glu Asp Ala Ile Ile Ser Leu His Gln Ile
70 75 80

ctt ggc aat aga tgg tca gcg att gca gca aaa ctg cct gga aga acc 345
Leu Gly Asn Arg Trp Ser Ala Ile Ala Ala Lys Leu Pro Gly Arg Thr
85 90 95 100

gat aac gag atc aag aac gta tgg cac act cac ttg aag aag aga ctc 393
Asp Asn Glu Ile Lys Asn Val Trp His Thr His Leu Lys Lys Arg Leu
105 110 115

gaa gat tat caa cca gct aaa cct aag acc agc aac aaa aag aag ggt 441
Glu Asp Tyr Gln Pro Ala Lys Pro Lys Thr Ser Asn Lys Lys Lys Gly
120 125 130

act aaa cca aaa tct gaa tcc gta ata acg agc tcg aac agt act aga 489
Thr Lys Pro Lys Ser Glu Ser Val Ile Thr Ser Ser Asn Ser Thr Arg
135 140 145

agc gaa tcg gag cta gca gat tca tca aac cct tct gga gaa agc tta 537
Ser Glu Ser Glu Leu Ala Asp Ser Ser Asn Pro Ser Gly Glu Ser Leu
150 155 160

ttt tcg aca tcg cct tcg aca agt gag gtt tct tcg atg aca ctc ata 585
Phe Ser Thr Ser Pro Ser Thr Ser Glu Val Ser Ser Met Thr Leu Ile
165 170 175 180

agc cac gac ggc tat agc aac gag att aat atg gat aac aaa ccg gga 633
Ser His Asp Gly Tyr Ser Asn Glu Ile Asn Met Asp Asn Lys Pro Gly
185 190 195

gat atc agt act atc gat caa gaa tgt gtt tct ttc gaa act ttt ggt 681
Asp Ile Ser Thr Ile Asp Gln Glu Cys Val Ser Phe Glu Thr Phe Gly
200 205 210

gcg gat atc gat gaa agc ttc tgg aaa gag aca ctg tat agc caa gat 729
Ala Asp Ile Asp Glu Ser Phe Trp Lys Glu Thr Leu Tyr Ser Gln Asp
215 220 225

gaa cac aac tac gta tcg aat gac cta gaa gtc gct ggt tta gtt gag 777
Glu His Asn Tyr Val Ser Asn Asp Leu Glu Val Ala Gly Leu Val Glu
230 235 240

ata caa caa gag ttt caa aac ttg ggc tcc gct aat aat gag atg att 825
Ile Gln Gln Glu Phe Gln Asn Leu Gly Ser Ala Asn Asn Glu Met Ile
245 250 255 260

ttt gac agt gag atg gaa ctt ctg gtt cga tgt att ggc tag 867
Phe Asp Ser Glu Met Glu Leu Leu Val Arg Cys Ile Gly
265 270

aaccggcggg gaacaagatc tcttagccgg gctctagtta acatgtttga ggagtaaagt 927

gaaatggtgc aaattagtta aggctaagaa attcaaaagc ttttgtttac cgagaaaaaa 987 

acacactcta actcttgatg tgatgtagtt agtgtattaa ttagaggctg cgttttcaa 1046 

<210> 40 
<211> 273 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 40 

Met Gly Arg Ala Pro Cys Cys Glu Lys Met Gly Leu Lys Arg Gly Pro
1 5 10 15

Trp Thr Pro Glu Glu Asp Gln Ile Leu Val Ser Phe Ile Leu Asn His
20 25 30

Gly His Ser Asn Trp Arg Ala Leu Pro Lys Gln Ala Gly Leu Leu Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Met Asn Tyr Leu Lys Pro Asp
50 55 60

Ile Lys Arg Gly Asn Phe Thr Lys Glu Glu Glu Asp Ala Ile Ile Ser
65 70 75 80

Leu His Gln Ile Leu Gly Asn Arg Trp Ser Ala Ile Ala Ala Lys Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Val Trp His Thr His Leu
100 105 110

Lys Lys Arg Leu Glu Asp Tyr Gln Pro Ala Lys Pro Lys Thr Ser Asn
115 120 125

Lys Lys Lys Gly Thr Lys Pro Lys Ser Glu Ser Val Ile Thr Ser Ser
130 135 140

Asn Ser Thr Arg Ser Glu Ser Glu Leu Ala Asp Ser Ser Asn Pro Ser
145 150 155 160

Gly Glu Ser Leu Phe Ser Thr Ser Pro Ser Thr Ser Glu Val Ser Ser
165 170 175

Met Thr Leu Ile Ser His Asp Gly Tyr Ser Asn Glu Ile Asn Met Asp
180 185 190

Asn Lys Pro Gly Asp Ile Ser Thr Ile Asp Gln Glu Cys Val Ser Phe
195 200 205

Glu Thr Phe Gly Ala Asp Ile Asp Glu Ser Phe Trp Lys Glu Thr Leu
210 215 220

Tyr Ser Gln Asp Glu His Asn Tyr Val Ser Asn Asp Leu Glu Val Ala
225 230 235 240

Gly Leu Val Glu Ile Gln Gln Glu Phe Gln Asn Leu Gly Ser Ala Asn
245 250 255

Asn Glu Met Ile Phe Asp Ser Glu Met Glu Leu Leu Val Arg Cys Ile
260 265 270

Gly



<210> 41 

<211> 1262 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (217)..(957)

<223> G463 

<400> 41 

ctcgagctac gtcaggggtc tctttctgtt tgtttgtttt cttgtttcct tctctctctc 60 

tctttctttc tttgtcttcc tttcccaggt tgtttttttt tgctctctct gccttcttga 120 

ctttcaaaag actctttctt tcttttggat tgattttgga ttctagggct ctctttcttt 180 
tagtgggttt ttgttgttgt tgttgtggtc tctctg atg att act gaa ctt gag 234
Met Ile Thr Glu Leu Glu
1 5

atg ggg aaa ggt gag agt gag ctt gag ctt ggt cta ggg ctg agt ctt 282
Met Gly Lys Gly Glu Ser Glu Leu Glu Leu Gly Leu Gly Leu Ser Leu
10 15 20

ggc ggt gga acg gcg gcc aag att ggt aaa tca ggt ggt ggt ggc gcg 330
Gly Gly Gly Thr Ala Ala Lys Ile Gly Lys Ser Gly Gly Gly Gly Ala
25 30 35

tgg gga gag cgt gga agg ctt ttg acg gct aag gat ttt cct tct gtt 378
Trp Gly Glu Arg Gly Arg Leu Leu Thr Ala Lys Asp Phe Pro Ser Val
40 45 50

ggt tct aaa cgt gct gct gat tct gct tct cat gct ggt tca tct cct 426
Gly Ser Lys Arg Ala Ala Asp Ser Ala Ser His Ala Gly Ser Ser Pro
55 60 65 70

cct cgt tca agt caa gtt gtt gga tgg cct cct ata ggg tca cac agg 474
Pro Arg Ser Ser Gln Val Val Gly Trp Pro Pro Ile Gly Ser His Arg
75 80 85

atg aac agt ttg gtt aat aac caa gct aca aag tca gca aga gaa gaa 522
Met Asn Ser Leu Val Asn Asn Gln Ala Thr Lys Ser Ala Arg Glu Glu
90 95 100

gaa gaa gct ggt aag aag aaa gtg aaa gat gat gaa cct aaa gat gtg 570
Glu Glu Ala Gly Lys Lys Lys Val Lys Asp Asp Glu Pro Lys Asp Val
105 110 115

aca aag aaa gtg aat ggg aaa gta caa gtt gga ttt att aag gtg aac 618
Thr Lys Lys Val Asn Gly Lys Val Gln Val Gly Phe Ile Lys Val Asn
120 125 130

atg gat gga gtt gct ata gga aga aaa gtg gat ttg aat gct cat tct 666
Met Asp Gly Val Ala Ile Gly Arg Lys Val Asp Leu Asn Ala His Ser
135 140 145 150

tct tac gag aat ttg gcg caa aca ttg gaa gat atg ttc ttt cgc act 714
Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu Asp Met Phe Phe Arg Thr
155 160 165

aat ccg ggt act gtc ggg tta acc agt cag ttc act aaa ccg ttg agg 762
Asn Pro Gly Thr Val Gly Leu Thr Ser Gln Phe Thr Lys Pro Leu Arg
170 175 180

ctt tta gat gga tcg tct gag ttt gta ctt act tat gaa gat aag gaa 810
Leu Leu Asp Gly Ser Ser Glu Phe Val Leu Thr Tyr Glu Asp Lys Glu
185 190 195

gga gat tgg atg ctt gtt ggt gat gtt cca tgg aga atg ttc atc aac 858
Gly Asp Trp Met Leu Val Gly Asp Val Pro Trp Arg Met Phe Ile Asn
200 205 210

tcg gtg aaa agg cta cgt gtg atg aaa acc tct gaa gct aat gga ctc 906
Ser Val Lys Arg Leu Arg Val Met Lys Thr Ser Glu Ala Asn Gly Leu
215 220 225 230

gct gca cga aat caa gaa cca aac gag aga cag cga aag cag ccg gtt 954
Ala Ala Arg Asn Gln Glu Pro Asn Glu Arg Gln Arg Lys Gln Pro Val
235 240 245

tag atctcttttc gacgttacgg tgttacaggt tttatatttt ggggttttgc 1007

aagtctgaga tacttctgaa gcaagcataa gctagattga tcttatatcc agtttgtgta 1067

ttttcttggt tcttataatg gtttttactg gttttcttta gttttttttt ttgctgtctt 1127

ttaattttcg gttgcgattt cactatatac tatggatgga agagaatgct ctttatatct 1187

tttactacac tgtaaatatt tgaagcttat ctaatatcgt ttttaagggt taaaaaaccc 1247

tgacgtagcc tcgag 1262



<210> 42
<211> 246
<212> PRT

<213> Arabidopsis thaliana
<400> 42



Met Ile Thr Glu Leu Glu Met Gly Lys Gly Glu Ser Glu Leu Glu Leu
1 5 10 15

Gly Leu Gly Leu Ser Leu Gly Gly Gly Thr Ala Ala Lys Ile Gly Lys
20 25 30

Ser Gly Gly Gly Gly Ala Trp Gly Glu Arg Gly Arg Leu Leu Thr Ala
35 40 45

Lys Asp Phe Pro Ser Val Gly Ser Lys Arg Ala Ala Asp Ser Ala Ser
50 55 60

His Ala Gly Ser Ser Pro Pro Arg Ser Ser Gln Val Val Gly Trp Pro
65 70 75 80

Pro Ile Gly Ser His Arg Met Asn Ser Leu Val Asn Asn Gln Ala Thr
85 90 95

Lys Ser Ala Arg Glu Glu Glu Glu Ala Gly Lys Lys Lys Val Lys Asp
100 105 110

Asp Glu Pro Lys Asp Val Thr Lys Lys Val Asn Gly Lys Val Gln Val
115 120 125

Gly Phe Ile Lys Val Asn Met Asp Gly Val Ala Ile Gly Arg Lys Val
130 135 140

Asp Leu Asn Ala His Ser Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu
145 150 155 160

Asp Met Phe Phe Arg Thr Asn Pro Gly Thr Val Gly Leu Thr Ser Gln
165 170 175

Phe Thr Lys Pro Leu Arg Leu Leu Asp Gly Ser Ser Glu Phe Val Leu
180 185 190

Thr Tyr Glu Asp Lys Glu Gly Asp Trp Met Leu Val Gly Asp Val Pro
195 200 205

Trp Arg Met Phe Ile Asn Ser Val Lys Arg Leu Arg Val Met Lys Thr
210 215 220

Ser Glu Ala Asn Gly Leu Ala Ala Arg Asn Gln Glu Pro Asn Glu Arg
225 230 235 240

Gln Arg Lys Gln Pro Val
245



<210> 43 
<211> 741

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1)..(741)

<223> G2422 

<400> 43 

atg ggc gaa tca ccc aaa ggg ttg aga aaa ggt aca tgg act act gaa 48
Met Gly Glu Ser Pro Lys Gly Leu Arg Lys Gly Thr Trp Thr Thr Glu
1 5 10 15

gaa gat att ctc ttg agg caa tgc att gat aag tat gga gaa ggc aaa 96
Glu Asp Ile Leu Leu Arg Gln Cys Ile Asp Lys Tyr Gly Glu Gly Lys
20 25 30

tgg cat cga gtt cct tta aga act ggt ctc aat cgg tgc cga aag agt 144
Trp His Arg Val Pro Leu Arg Thr Gly Leu Asn Arg Cys Arg Lys Ser
35 40 45

tgt aga ctt aga tgg ttg aat tat ttg aag cca agt att aag aga gga 192
Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly
50 55 60

aaa ctc tgc tcc gat gaa gtt gat ctt gtt ctt cgc ctt cat aaa ctt 240
Lys Leu Cys Ser Asp Glu Val Asp Leu Val Leu Arg Leu His Lys Leu
65 70 75 80

cta gga aat agg tgg tcc ttg atc gct ggt aga ttg cct ggt cgg act 288
Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr
85 90 95

gct aat gat gtc aag aat tac tgg aac act cat ttg agt aag aag cac 336
Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His
100 105 110

gat gaa cga tgc tgt aag acg aag atg ata aac aaa aac att act tct 384
Asp Glu Arg Cys Cys Lys Thr Lys Met Ile Asn Lys Asn Ile Thr Ser
115 120 125

cat cct act tca tcg gcc caa aaa atc gat gtt tta aag cct cgg cct 432
His Pro Thr Ser Ser Ala Gln Lys Ile Asp Val Leu Lys Pro Arg Pro
130 135 140

cga tcc ttc tcc gat aaa aat agt tgc aac gat gtc aat atc ttg cca 480
Arg Ser Phe Ser Asp Lys Asn Ser Cys Asn Asp Val Asn Ile Leu Pro
145 150 155 160

aaa gtt gac gtt gtt cct tta cat ctt gga ctc aac aac aat tat gtt 528
Lys Val Asp Val Val Pro Leu His Leu Gly Leu Asn Asn Asn Tyr Val
165 170 175

tgt gaa agt agt att aca tgt aac aaa gat gag caa aaa gat aag ctt 576
Cys Glu Ser Ser Ile Thr Cys Asn Lys Asp Glu Gln Lys Asp Lys Leu
180 185 190

att aat att aat cta ttg gat gga gat aat atg tgg tgg gaa agt tta 624
Ile Asn Ile Asn Leu Leu Asp Gly Asp Asn Met Trp Trp Glu Ser Leu
195 200 205

ctg gag gca gat gtg ttg ggt cca gaa gct acg gaa aca gca aag ggt 672
Leu Glu Ala Asp Val Leu Gly Pro Glu Ala Thr Glu Thr Ala Lys Gly
210 215 220

gtg acc tta ccg ctt gac ttt gag caa att tgg gct cgg ttt gat gaa 720
Val Thr Leu Pro Leu Asp Phe Glu Gln Ile Trp Ala Arg Phe Asp Glu
225 230 235 240

gag act tta gaa ctg aat tag 741 
Glu Thr Leu Glu Leu Asn 
245 

<210> 44
<211> 246 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 44 

Met Gly Glu Ser Pro Lys Gly Leu Arg Lys Gly Thr Trp Thr Thr Glu
1 5 10 15

Glu Asp Ile Leu Leu Arg Gln Cys Ile Asp Lys Tyr Gly Glu Gly Lys
20 25 30

Trp His Arg Val Pro Leu Arg Thr Gly Leu Asn Arg Cys Arg Lys Ser
35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly
50 55 60

Lys Leu Cys Ser Asp Glu Val Asp Leu Val Leu Arg Leu His Lys Leu
65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr
85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His
100 105 110

Asp Glu Arg Cys Cys Lys Thr Lys Met Ile Asn Lys Asn Ile Thr Ser
115 120 125

His Pro Thr Ser Ser Ala Gln Lys Ile Asp Val Leu Lys Pro Arg Pro
130 135 140

Arg Ser Phe Ser Asp Lys Asn Ser Cys Asn Asp Val Asn Ile Leu Pro
145 150 155 160

Lys Val Asp Val Val Pro Leu His Leu Gly Leu Asn Asn Asn Tyr Val
165 170 175

Cys Glu Ser Ser Ile Thr Cys Asn Lys Asp Glu Gln Lys Asp Lys Leu
180 185 190

Ile Asn Ile Asn Leu Leu Asp Gly Asp Asn Met Trp Trp Glu Ser Leu
195 200 205

Leu Glu Ala Asp Val Leu Gly Pro Glu Ala Thr Glu Thr Ala Lys Gly
210 215 220

Val Thr Leu Pro Leu Asp Phe Glu Gln Ile Trp Ala Arg Phe Asp Glu
225 230 235 240

Glu Thr Leu Glu Leu Asn
245



<210> 45 
<211> 762 
<212> DNA 
<213> Arabidopsis thaliana
<220> 

<221> CDS 

<222> (1)..(630)

<223> G2421 



<400> 45 



atg gag ggt tcg tcc aaa ggg ttg agg aaa ggt gca tgg act gct gaa 48
Met Glu Gly Ser Ser Lys Gly Leu Arg Lys Gly Ala Trp Thr Ala Glu
1 5 10 15

gaa gat agt ctc ttg agg cag tgt att ggt aag tat gga gaa ggc aaa 96
Glu Asp Ser Leu Leu Arg Gln Cys Ile Gly Lys Tyr Gly Glu Gly Lys
20 25 30

tgg cat caa gtt cct tta aga gct ggg cta aat cgg tgc agg aaa agt 144
Trp His Gln Val Pro Leu Arg Ala Gly Leu Asn Arg Cys Arg Lys Ser
35 40 45

tgt aga cta aga tgg tta aac tat ttg aag cca agt atc aag aga gga 192
Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly
50 55 60

aaa ttt agt tct gat gaa gtt gat ctt ctt ctt cgt ctt cat aag ctt 240
Lys Phe Ser Ser Asp Glu Val Asp Leu Leu Leu Arg Leu His Lys Leu
65 70 75 80

cta gga aat agg tgg tcc ttg att gct ggt cga tta cct ggt cgg acc 288
Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr
85 90 95

gct aat gat gtc aag aac tac tgg aac acc cat ctg agt aag aag cat 336
Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His
100 105 110

gaa ccg tgt tgt aaa act aag ata aaa agg ata aat att ata acc cct 384
Glu Pro Cys Cys Lys Thr Lys Ile Lys Arg Ile Asn Ile Ile Thr Pro
115 120 125

cct aat aca ccg gcc caa aaa gtt tgt gaa aat agt atc aca tgt aac 432
Pro Asn Thr Pro Ala Gln Lys Val Cys Glu Asn Ser Ile Thr Cys Asn
130 135 140

aaa gat gat gag aaa gat gat ttt gtg gat aat ttt atg gtt gga gat 480
Lys Asp Asp Glu Lys Asp Asp Phe Val Asp Asn Phe Met Val Gly Asp
145 150 155 160

aat ata tgg ttg gag cgt ttg cta gac gag ggc caa gag gta gat gtg 528
Asn Ile Trp Leu Glu Arg Leu Leu Asp Glu Gly Gln Glu Val Asp Val
165 170 175

ctg gtt aca gaa gcg gcg gca aca gaa aag gag ggc act ttg gcg ttt 576
Leu Val Thr Glu Ala Ala Ala Thr Glu Lys Glu Gly Thr Leu Ala Phe
180 185 190

gac gtt gag caa ctt tgg aat ttg ttc gat gga gag act gtg atc ttt 624
Asp Val Glu Gln Leu Trp Asn Leu Phe Asp Gly Glu Thr Val Ile Phe
195 200 205

gat tag tgtttataaa cgtttgtgtt ctcttgtttg tgaggtttct ctatttaatt 680 
Asp 

tagtatctat tttctaaatt aactaatatc ttatagtatt ttaggcaaac cttatgtttc 740 

cgtttctgtg cggccgctct ag 762 

<210> 46 
<211> 209 
<212> PRT 

<213> Arabidopsis thaliana 
MBI-17 Sequence Listing. ST25

<400> 46

Met Glu Gly Ser Ser Lys Gly Leu Arg Lys Gly Ala Trp Thr Ala Glu
1 5 10 15

Glu Asp Ser Leu Leu Arg Gln Cys Ile Gly Lys Tyr Gly Glu Gly Lys
20 25 30

Trp His Gln Val Pro Leu Arg Ala Gly Leu Asn Arg Cys Arg Lys Ser
35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly
50 55 60

Lys Phe Ser Ser Asp Glu Val Asp Leu Leu Leu Arg Leu His Lys Leu
65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr
85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His
100 105 110

Glu Pro Cys Cys Lys Thr Lys Ile Lys Arg Ile Asn Ile Ile Thr Pro
115 120 125

Pro Asn Thr Pro Ala Gln Lys Val Cys Glu Asn Ser Ile Thr Cys Asn
130 135 140

Lys Asp Asp Glu Lys Asp Asp Phe Val Asp Asn Phe Met Val Gly Asp
145 150 155 160

Asn Ile Trp Leu Glu Arg Leu Leu Asp Glu Gly Gln Glu Val Asp Val
165 170 175

Leu Val Thr Glu Ala Ala Ala Thr Glu Lys Glu Gly Thr Leu Ala Phe
180 185 190

Asp Val Glu Gln Leu Trp Asn Leu Phe Asp Gly Glu Thr Val Ile Phe
195 200 205

Asp



<210> 47 

<211> 1665 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (33)..(1376)

<223> G772 



<400> 47 

aaaaaccaaa accggttttt ttttttcttt tt atg ggt cgc gaa tct ctg gct 53
Met Gly Arg Glu Ser Leu Ala
1 5

gtt gtg tcc tcg ccg cca tcg gcg act gcg ccc agt act gct gtg tcg 101
Val Val Ser Ser Pro Pro Ser Ala Thr Ala Pro Ser Thr Ala Val Ser
10 15 20

gct acc tcg ctt gct cct ggc ttt cga ttt cat ccg act gat gag gaa 149
Ala Thr Ser Leu Ala Pro Gly Phe Arg Phe His Pro Thr Asp Glu Glu
25 30 35

ctc gtg agc tat tac ttg aag agg aag gtt ctg ggt aaa cct gta cgc 197
Leu Val Ser Tyr Tyr Leu Lys Arg Lys Val Leu Gly Lys Pro Val Arg
40 45 50 55

ttc gat gcg att gga gag gta gat atc tac aag cat gag ccc tgg gat 245
Phe Asp Ala Ile Gly Glu Val Asp Ile Tyr Lys His Glu Pro Trp Asp
60 65 70

tta gca gtg ttt tcg aag ttg aaa act cgg gac caa gaa tgg tac ttc 293
Leu Ala Val Phe Ser Lys Leu Lys Thr Arg Asp Gln Glu Trp Tyr Phe
75 80 85

ttc agt gcg tta gat aag aag tac ggt aat ggt gct agg atg aat cga 341
Phe Ser Ala Leu Asp Lys Lys Tyr Gly Asn Gly Ala Arg Met Asn Arg
90 95 100

gca act aac aaa ggg tac tgg aaa gca act gga aaa gac aga gaa atc 389
Ala Thr Asn Lys Gly Tyr Trp Lys Ala Thr Gly Lys Asp Arg Glu Ile
105 110 115

cgc cgg gat att cag ttg ctc ggt atg aaa aag acg ctt gtt ttc cac 437
Arg Arg Asp Ile Gln Leu Leu Gly Met Lys Lys Thr Leu Val Phe His
120 125 130 135

agc ggg cgt gct cca gac ggc ctt cgg act aat tgg gtc atg cac gag 485
Ser Gly Arg Ala Pro Asp Gly Leu Arg Thr Asn Trp Val Met His Glu
140 145 150

tat cgc ctt gtg gaa tat gaa act gaa act aac gga agc ctg ctg cag 533
Tyr Arg Leu Val Glu Tyr Glu Thr Glu Thr Asn Gly Ser Leu Leu Gln
155 160 165

gat gca tat gtg ttg tgc aga gtg ttt cac aag aat aac att ggg cca 581
Asp Ala Tyr Val Leu Cys Arg Val Phe His Lys Asn Asn Ile Gly Pro
170 175 180

cca agt ggg aac aga tat gcg cca ttc atg gaa gaa gaa tgg gct gat 629
Pro Ser Gly Asn Arg Tyr Ala Pro Phe Met Glu Glu Glu Trp Ala Asp
185 190 195

ggt gga gga gct ctg att cca gga ata gac gtt agg gtc agg gta gag 677
Gly Gly Gly Ala Leu Ile Pro Gly Ile Asp Val Arg Val Arg Val Glu
200 205 210 215

gct cta cca caa gcc aat gga aac aac cag atg gac cag gaa atg cat 725
Ala Leu Pro Gln Ala Asn Gly Asn Asn Gln Met Asp Gln Glu Met His
220 225 230

tca gca agc aag gat ctc att aac atc aac gag cta ccg aga gat gct 773
Ser Ala Ser Lys Asp Leu Ile Asn Ile Asn Glu Leu Pro Arg Asp Ala
235 240 245

act cca atg gac atc gaa cct aac caa cag aat cat cat gag agt gcc 821
Thr Pro Met Asp Ile Glu Pro Asn Gln Gln Asn His His Glu Ser Ala
250 255 260

ttc aag cca cag gag agt aac aac cat agt ggt tat gaa gaa gat gag 869
Phe Lys Pro Gln Glu Ser Asn Asn His Ser Gly Tyr Glu Glu Asp Glu
265 270 275

gac aca ctc aaa cgc gag cac gca gaa gaa gat gag cgt cct cct tct 917
Asp Thr Leu Lys Arg Glu His Ala Glu Glu Asp Glu Arg Pro Pro Ser
280 285 290 295

cta tgc att ctc aac aaa gaa gct cca cta cct ctc ctg caa tac aaa 965
Leu Cys Ile Leu Asn Lys Glu Ala Pro Leu Pro Leu Leu Gln Tyr Lys
300 305 310 

cgt aga cgc caa aac gag tcc aac aac aac tca agc agg aac aca cag 1013
Arg Arg Arg Gln Asn Glu Ser Asn Asn Asn Ser Ser Arg Asn Thr Gln
315 320 325

gac cat tgt tcg tcc aca ata aca acc gtc gac aat aca acc acc tta 1061
Asp His Cys Ser Ser Thr Ile Thr Thr Val Asp Asn Thr Thr Thr Leu
330 335 340

atc tca tca tct gct gct gct gct acc aac act gcc atc tct gca ttg 1109
Ile Ser Ser Ser Ala Ala Ala Ala Thr Asn Thr Ala Ile Ser Ala Leu
345 350 355

ctt gag ttc tca ctt atg ggt atc tcc gac aag aaa gaa aac cag cag 1157
Leu Glu Phe Ser Leu Met Gly Ile Ser Asp Lys Lys Glu Asn Gln Gln
360 365 370 375

aaa gag gaa act tct cct cct agt cca att gca tct cct gaa gag aag 1205
Lys Glu Glu Thr Ser Pro Pro Ser Pro Ile Ala Ser Pro Glu Glu Lys
380 385 390

gtt aat gat ctc cag aag gag gtt cac cag atg tct gtt gaa aga gaa 1253
Val Asn Asp Leu Gln Lys Glu Val His Gln Met Ser Val Glu Arg Glu
395 400 405

act ttc aag ctt gag atg atg agt gca gag gct atg atc agc att ctc 1301
Thr Phe Lys Leu Glu Met Met Ser Ala Glu Ala Met Ile Ser Ile Leu
410 415 420

cag tca aga atc gat gcg ctg cgt cag gag aac gag gaa ctt aag aag 1349
Gln Ser Arg Ile Asp Ala Leu Arg Gln Glu Asn Glu Glu Leu Lys Lys
425 430 435

aag aac gcc agt gga caa gct agt taa accaccgcaa catctctcca 1396
Lys Asn Ala Ser Gly Gln Ala Ser
440 445

ggtgtcttct tcttcttctt cttcttcttt gcctcttagc tgtaatcttc ttaatagtat 1456

gagctatgga tgtagcttct tcagacggat cagaaacctt atgaatctct gttgtaaaat 1516

taggataaaa cggaacggag ccaaccaact aggtcttttt attttatcct tttttacttt 1576

ggatgtttct gcatcttttg ggaacatttt caggctgatc cattgtcgta tattatcatc 1636

tatctatcta gtcttttcag acaaaaaaa 1665



<210> 48 
<211> 447 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 48

Met Gly Arg Glu Ser Leu Ala Val Val Ser Ser Pro Pro Ser Ala Thr
1 5 10 15

Ala Pro Ser Thr Ala Val Ser Ala Thr Ser Leu Ala Pro Gly Phe Arg
20 25 30

Phe His Pro Thr Asp Glu Glu Leu Val Ser Tyr Tyr Leu Lys Arg Lys
35 40 45

Val Leu Gly Lys Pro Val Arg Phe Asp Ala Ile Gly Glu Val Asp Ile
50 55 60

Tyr Lys His Glu Pro Trp Asp Leu Ala Val Phe Ser Lys Leu Lys Thr
65 70 75 80

Arg Asp Gln Glu Trp Tyr Phe Phe Ser Ala Leu Asp Lys Lys Tyr Gly
85 90 95

Asn Gly Ala Arg Met Asn Arg Ala Thr Asn Lys Gly Tyr Trp Lys Ala
100 105 110

Thr Gly Lys Asp Arg Glu Ile Arg Arg Asp Ile Gln Leu Leu Gly Met
115 120 125

Lys Lys Thr Leu Val Phe His Ser Gly Arg Ala Pro Asp Gly Leu Arg
130 135 140

Thr Asn Trp Val Met His Glu Tyr Arg Leu Val Glu Tyr Glu Thr Glu
145 150 155 160

Thr Asn Gly Ser Leu Leu Gln Asp Ala Tyr Val Leu Cys Arg Val Phe
165 170 175

His Lys Asn Asn Ile Gly Pro Pro Ser Gly Asn Arg Tyr Ala Pro Phe
180 185 190

Met Glu Glu Glu Trp Ala Asp Gly Gly Gly Ala Leu Ile Pro Gly Ile
195 200 205

Asp Val Arg Val Arg Val Glu Ala Leu Pro Gln Ala Asn Gly Asn Asn
210 215 220

Gln Met Asp Gln Glu Met His Ser Ala Ser Lys Asp Leu Ile Asn Ile
225 230 235 240

Asn Glu Leu Pro Arg Asp Ala Thr Pro Met Asp Ile Glu Pro Asn Gln
245 250 255

Gln Asn His His Glu Ser Ala Phe Lys Pro Gln Glu Ser Asn Asn His
260 265 270

Ser Gly Tyr Glu Glu Asp Glu Asp Thr Leu Lys Arg Glu His Ala Glu
275 280 285

Glu Asp Glu Arg Pro Pro Ser Leu Cys Ile Leu Asn Lys Glu Ala Pro
290 295 300

Leu Pro Leu Leu Gln Tyr Lys Arg Arg Arg Gln Asn Glu Ser Asn Asn
305 310 315 320

Asn Ser Ser Arg Asn Thr Gln Asp His Cys Ser Ser Thr Ile Thr Thr
325 330 335

Val Asp Asn Thr Thr Thr Leu Ile Ser Ser Ser Ala Ala Ala Ala Thr
340 345 350

Asn Thr Ala Ile Ser Ala Leu Leu Glu Phe Ser Leu Met Gly Ile Ser
355 360 365

Asp Lys Lys Glu Asn Gln Gln Lys Glu Glu Thr Ser Pro Pro Ser Pro
370 375 380

Ile Ala Ser Pro Glu Glu Lys Val Asn Asp Leu Gln Lys Glu Val His
385 390 395 400

Gln Met Ser Val Glu Arg Glu Thr Phe Lys Leu Glu Met Met Ser Ala
405 410 415

Glu Ala Met Ile Ser Ile Leu Gln Ser Arg Ile Asp Ala Leu Arg Gln
420 425 430

Glu Asn Glu Glu Leu Lys Lys Lys Asn Ala Ser Gly Gln Ala Ser
435 440 445


<210> 49
<211> 1198
<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (56)..(1021)

<223> G866

<400> 49



aaaaaaaact tgcacatctt ctcagatctt caagtttctc ctctggtttc tcatc atg 58
Met
1

acc gtt gat att atg cgt tta cct aag atg gaa gat caa acg gct ata 106
Thr Val Asp Ile Met Arg Leu Pro Lys Met Glu Asp Gln Thr Ala Ile
5 10 15

caa gaa gct gca tca caa ggc tta aaa agc atg gaa cac ttg att cgt 154
Gln Glu Ala Ala Ser Gln Gly Leu Lys Ser Met Glu His Leu Ile Arg
20 25 30

gtc ctc tct aac cgt ccc gaa gaa cgt aac gtt gat tgc tct gag atc 202
Val Leu Ser Asn Arg Pro Glu Glu Arg Asn Val Asp Cys Ser Glu Ile
35 40 45

act gat ttc aca gtt tct aag ttc aag aaa gtt atc tct ctt ctt aac 250
Thr Asp Phe Thr Val Ser Lys Phe Lys Lys Val Ile Ser Leu Leu Asn
50 55 60 65

cgt tcc ggt cac gcc cgg ttt aga cgt ggt ccg gtt cat tcc cct cct 298
Arg Ser Gly His Ala Arg Phe Arg Arg Gly Pro Val His Ser Pro Pro
70 75 80

tcc tcc tcc gtt cct cca ccg gtg aaa gtg aca act ccg gct ccc act 346
Ser Ser Ser Val Pro Pro Pro Val Lys Val Thr Thr Pro Ala Pro Thr
85 90 95

cag atc tct gct cca gca ccg gtt agc ttc gtt cag gca aat caa caa 394
Gln Ile Ser Ala Pro Ala Pro Val Ser Phe Val Gln Ala Asn Gln Gln
100 105 110

agc gtg acg tta gat ttc act aga ccg agc gtt ttt ggc gct aaa acc 442
Ser Val Thr Leu Asp Phe Thr Arg Pro Ser Val Phe Gly Ala Lys Thr
115 120 125

aag agc tcg gag gtt gtt gag ttt gct aaa gag agc ttt agc gta tct 490
Lys Ser Ser Glu Val Val Glu Phe Ala Lys Glu Ser Phe Ser Val Ser
130 135 140 145

tct aac tct tct ttc atg tct tct gcg atc acc ggt gat gga agt gtc 538
Ser Asn Ser Ser Phe Met Ser Ser Ala Ile Thr Gly Asp Gly Ser Val
150 155 160

tct aaa ggc tct tcg atc ttt ctt gct ccg gct cca gcg gtg cca gtg 586
Ser Lys Gly Ser Ser Ile Phe Leu Ala Pro Ala Pro Ala Val Pro Val
165 170 175 

act tcc tcc ggg aaa ccg ccg ctt tct ggt ctt cct tac agg aag aga 634
Thr Ser Ser Gly Lys Pro Pro Leu Ser Gly Leu Pro Tyr Arg Lys Arg
180 185 190

tgc ttt gaa cat gac cac tct gaa ggc ttt tcc ggc aag atc tct ggc 682
Cys Phe Glu His Asp His Ser Glu Gly Phe Ser Gly Lys Ile Ser Gly
195 200 205

tcc ggc aac ggc aag tgc cat tgc aag aaa agc cga aaa aat cgg atg 730
Ser Gly Asn Gly Lys Cys His Cys Lys Lys Ser Arg Lys Asn Arg Met
210 215 220 225

aag aga acc gtg aga gta ccg gcg gta agt gca aag atc gcc gat ata 778
Lys Arg Thr Val Arg Val Pro Ala Val Ser Ala Lys Ile Ala Asp Ile
230 235 240

cca cca gac gaa tat tca tgg aga aag tat gga caa aaa ccg atc aaa 826
Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly Gln Lys Pro Ile Lys
245 250 255

ggc tca cca cat cca cgg ggt tat tac aag tgt agt aca ttt aga gga 874
Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys Ser Thr Phe Arg Gly
260 265 270

tgt cca gcg agg aaa cac gtg gaa aga gct ttg gat gat tca acg atg 922
Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu Asp Asp Ser Thr Met
275 280 285

ttg att gtg acg tac gaa gga gag cac cgt cat cac cag tcc acg atg 970
Leu Ile Val Thr Tyr Glu Gly Glu His Arg His His Gln Ser Thr Met
290 295 300 305

cag gag cat gta act cct agc gtg agt ggt ttg gtg ttt ggt tcg gct 1018
Gln Glu His Val Thr Pro Ser Val Ser Gly Leu Val Phe Gly Ser Ala
310 315 320 

tga agaattaatt agtttggtag ttttgtaata ttttgagaaa tagaggggtt 1071 

ggttttgtaa ttttttttct ataacaaaat tagttttaga ttttttttta gtagtctttt 1131 

gaatggattt taatcttact accgagaaag aaaaaattct tactacattt tcaaaaaaaa 1191 

aaaaaaa 1198 

<210> 50 
<211> 321 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 50 

Met Thr Val Asp Ile Met Arg Leu Pro Lys Met Glu Asp Gln Thr Ala
1 5 10 15

Ile Gln Glu Ala Ala Ser Gln Gly Leu Lys Ser Met Glu His Leu Ile
20 25 30

Arg Val Leu Ser Asn Arg Pro Glu Glu Arg Asn Val Asp Cys Ser Glu
35 40 45

Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Lys Val Ile Ser Leu Leu
50 55 60

Asn Arg Ser Gly His Ala Arg Phe Arg Arg Gly Pro Val His Ser Pro
65 70 75 80

Pro Ser Ser Ser Val Pro Pro Pro Val Lys Val Thr Thr Pro Ala Pro
85 90 95

Thr Gln Ile Ser Ala Pro Ala Pro Val Ser Phe Val Gln Ala Asn Gln
100 105 110

Gln Ser Val Thr Leu Asp Phe Thr Arg Pro Ser Val Phe Gly Ala Lys
115 120 125

Thr Lys Ser Ser Glu Val Val Glu Phe Ala Lys Glu Ser Phe Ser Val
130 135 140

Ser Ser Asn Ser Ser Phe Met Ser Ser Ala Ile Thr Gly Asp Gly Ser
145 150 155 160

Val Ser Lys Gly Ser Ser Ile Phe Leu Ala Pro Ala Pro Ala Val Pro
165 170 175

Val Thr Ser Ser Gly Lys Pro Pro Leu Ser Gly Leu Pro Tyr Arg Lys
180 185 190

Arg Cys Phe Glu His Asp His Ser Glu Gly Phe Ser Gly Lys Ile Ser
195 200 205

Gly Ser Gly Asn Gly Lys Cys His Cys Lys Lys Ser Arg Lys Asn Arg
210 215 220

Met Lys Arg Thr Val Arg Val Pro Ala Val Ser Ala Lys Ile Ala Asp
225 230 235 240

Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly Gln Lys Pro Ile
245 250 255

Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys Ser Thr Phe Arg
260 265 270

Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu Asp Asp Ser Thr
275 280 285

Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His His Gln Ser Thr
290 295 300

Met Gln Glu His Val Thr Pro Ser Val Ser Gly Leu Val Phe Gly Ser
305 310 315 320

Ala



<210> 51 

<211> 2310 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (179)..(2065)

<223> G941 



<400> 51 

tcttcttctt cttcctcttc ctcatctcgt atctctaact tttgtcgaag ttcttttgat 60

gaaactaggg tttattatct tctccttctt tttcccatca ccatagaaaa ggcagagacc 120

tttttcttca tcatttttat tctccttctt cttctgctgt tcatttctcc aggttaca 178 

atg atg ttt aat gag atg gga atg tgt gga aac atg gat ttc ttc tct 226
Met Met Phe Asn Glu Met Gly Met Cys Gly Asn Met Asp Phe Phe Ser
1 5 10 15

tct gga tca ctt ggt gaa gtt gat ttc tgt cct gtt cca caa gct gag 274
Ser Gly Ser Leu Gly Glu Val Asp Phe Cys Pro Val Pro Gln Ala Glu
20 25 30

cct gat tcc att gtt gaa gat gac tat act gat gat gag att gat gtt 322
Pro Asp Ser Ile Val Glu Asp Asp Tyr Thr Asp Asp Glu Ile Asp Val
35 40 45

gat gaa ttg gag agg agg atg tgg aga gac aaa atg cgg ctt aaa cgt 370
Asp Glu Leu Glu Arg Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg
50 55 60

ctc aag gag cag gat aag ggt aaa gaa ggt gtt gat gct gct aaa cag 418
Leu Lys Glu Gln Asp Lys Gly Lys Glu Gly Val Asp Ala Ala Lys Gln
65 70 75 80

agg cag tct caa gag caa gct agg agg aag aaa atg tct aga gct caa 466
Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala Gln
85 90 95

gat ggg atc ttg aag tat atg ttg aag atg atg gaa gtt tgt aaa gct 514
Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys Ala
100 105 110

caa ggc ttt gtt tat ggg att att ccg gag aat ggg aag cct gtg act 562
Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Asn Gly Lys Pro Val Thr
115 120 125

ggt gct tct gat aat tta agg gag tgg tgg aaa gat aag gtt agg ttt 610
Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg Phe
130 135 140

gat cgt aat ggt cct gcg gct att acc aag tat caa gcg gag aat aat 658
Asp Arg Asn Gly Pro Ala Ala Ile Thr Lys Tyr Gln Ala Glu Asn Asn
145 150 155 160

atc ccg ggg att cat gaa ggt aat aac ccg att gga ccg act cct cat 706
Ile Pro Gly Ile His Glu Gly Asn Asn Pro Ile Gly Pro Thr Pro His
165 170 175

acc ttg caa gag ctt caa gac acg act ctt gga tcg ctt ttg tct gcg 754
Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu Ser Ala
180 185 190

ttg atg caa cac tgt gat cct cct cag aga cgt ttt cct ttg gag aaa 802
Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu Glu Lys
195 200 205

gga gtt cct cct ccg tgg tgg cct aat ggg aaa gag gat tgg tgg cct 850
Gly Val Pro Pro Pro Trp Trp Pro Asn Gly Lys Glu Asp Trp Trp Pro
210 215 220

caa ctt ggt ttg cct aaa gat caa ggt cct gca cct tac aag aag cct 898
Gln Leu Gly Leu Pro Lys Asp Gln Gly Pro Ala Pro Tyr Lys Lys Pro
225 230 235 240

cat gat ttg aag aag gcg tgg aaa gtc ggc gtt ttg act gcg gtt atc 946
His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala Val Ile
245 250 255

aag cat atg ttt cct gat att gct aag atc cgt aag ctc gtg agg caa 994
Lys His Met Phe Pro Asp Ile Ala Lys Ile Arg Lys Leu Val Arg Gln
260 265 270

tct aaa tgt ttg cag gat aag atg act gct aaa gag agt gct acc tgg 1042
Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala Thr Trp
275 280 285

ctt gct att att aac caa gaa gag tcc ttg gct aga gag ctt tat ccc 1090
Leu Ala Ile Ile Asn Gln Glu Glu Ser Leu Ala Arg Glu Leu Tyr Pro
290 295 300

gag tca tgt cca cct ctt tct ctg tct ggt gga agt tgc tcg ctt ctg 1138
Glu Ser Cys Pro Pro Leu Ser Leu Ser Gly Gly Ser Cys Ser Leu Leu
305 310 315 320

atg aat gat tgc agt caa tac gat gtt gaa ggt ttc gag aag gag tct 1186
Met Asn Asp Cys Ser Gln Tyr Asp Val Glu Gly Phe Glu Lys Glu Ser
325 330 335

cac tat gaa gtg gaa gag ctc aag cca gaa aaa gtt atg aat tct tca 1234
His Tyr Glu Val Glu Glu Leu Lys Pro Glu Lys Val Met Asn Ser Ser
340 345 350

aac ttt ggg atg gtt gct aaa atg cat gac ttt cct gtc aaa gaa gaa 1282
Asn Phe Gly Met Val Ala Lys Met His Asp Phe Pro Val Lys Glu Glu
355 360 365

gtc cca gca gga aac tcg gaa ttc atg aga aag aga aag cca aac aga 1330
Val Pro Ala Gly Asn Ser Glu Phe Met Arg Lys Arg Lys Pro Asn Arg
370 375 380

gat ctg aac act att atg gac aga acc gtt ttc acc tgc gag aat ctt 1378
Asp Leu Asn Thr Ile Met Asp Arg Thr Val Phe Thr Cys Glu Asn Leu
385 390 395 400

ggg tgt gcg cac agc gaa atc agc cgg gga ttt ctg gat agg aat tcg 1426
Gly Cys Ala His Ser Glu Ile Ser Arg Gly Phe Leu Asp Arg Asn Ser
405 410 415

aga gac aac cat caa ctg gca tgt cca cat cga gac agt cgc tta ccg 1474
Arg Asp Asn His Gln Leu Ala Cys Pro His Arg Asp Ser Arg Leu Pro
420 425 430

tat gga gca gca cca tcc agg ttt cat gtc aat gaa gtt aag cct gta 1522
Tyr Gly Ala Ala Pro Ser Arg Phe His Val Asn Glu Val Lys Pro Val
435 440 445

gtt gga ttt cct cag cca agg cca gtg aac tca gta gcc caa cca att 1570
Val Gly Phe Pro Gln Pro Arg Pro Val Asn Ser Val Ala Gln Pro Ile
450 455 460

gac tta acg ggt ata gtt cct gaa gat gga cag aag atg atc tca gag 1618
Asp Leu Thr Gly Ile Val Pro Glu Asp Gly Gln Lys Met Ile Ser Glu
465 470 475 480

ctc atg tcc atg tac gac aga aat gtc cag agc aac caa acc tct atg 1666
Leu Met Ser Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Ser Met
485 490 495

gtc atg gaa aat caa agc gtg tca ctg ctt caa ccc aca gtc cat aac 1714
Val Met Glu Asn Gln Ser Val Ser Leu Leu Gln Pro Thr Val His Asn
500 505 510

cat caa gaa cat ctc cag ttc cca gga aac atg gtg gaa gga agt ttc 1762
His Gln Glu His Leu Gln Phe Pro Gly Asn Met Val Glu Gly Ser Phe
515 520 525

ttt gaa gac ttg aac atc cca aac aga gca aac aac aac aac agc agc 1810
Phe Glu Asp Leu Asn Ile Pro Asn Arg Ala Asn Asn Asn Asn Ser Ser
530 535 540

aac aat caa acg ttt ttt caa ggg aac aac aac aac aac aat gtg ttt 1858
Asn Asn Gln Thr Phe Phe Gln Gly Asn Asn Asn Asn Asn Asn Val Phe
545 550 555 560

aag ttc gac act gca gat cac aac aac ttt gaa gct gca cat aac aac 1906
Lys Phe Asp Thr Ala Asp His Asn Asn Phe Glu Ala Ala His Asn Asn
565 570 575

aac aat aac agt agc ggc aac agg ttc cag ctt gtg ttt gat tcc aca 1954
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Asn Asn Asn Ser Ser Gly Asn Arg Phe Gin Leu Val Phe Asp Ser Thr 
580 585 590 

ccg ttc gac atg gcg tea ttc gat tac aga gat gat atg teg atg cca 2002 
Pro Phe Asp Met Ala Ser Phe Asp Tyr Arg Asp Asp Met Ser Met Pro 
595 600 * 605 

gga gta gta gga acg atg gat gga atg cag cag aag cag caa gat gta 2050 
Gly Val Val Gly Thr Met Asp Gly Met Gin Gin Lys Gin Gin Asp Val 
610 615 620 

tec ata tgg ttc taa agtcttggta gtagatttca tcttctctta tttttatctt 2105 

Ser lie Trp Phe 

625 

ttgtgttctt acattcactc aaccatgtaa tattttttcc tgggtctctc tgtctctatc 2165 

gcttgttatg atgtgtctgt aagagtctct aaaaactctc tgttactgtg tgtctttgtc 2225 

tcggcttggt gaatctctct gtcatcatca gcttttagtt acacacccga cttggggatg 2285 

aacgaacact aaatgtaagt tttca 2310 

<210> 52 
<211> 628 
<212> PRT 

<213> Arabidopsis thaliana



<400> 52

Met Met Phe Asn Glu Met Gly Met Cys Gly Asn Met Asp Phe Phe Ser
1 5 10 15

Ser Gly Ser Leu Gly Glu Val Asp Phe Cys Pro Val Pro Gln Ala Glu
20 25 30

Pro Asp Ser Ile Val Glu Asp Asp Tyr Thr Asp Asp Glu Ile Asp Val
35 40 45

Asp Glu Leu Glu Arg Arg Met Trp Arg Asp Lys Met Arg Leu Lys Arg
50 55 60

Leu Lys Glu Gln Asp Lys Gly Lys Glu Gly Val Asp Ala Ala Lys Gln
65 70 75 80

Arg Gln Ser Gln Glu Gln Ala Arg Arg Lys Lys Met Ser Arg Ala Gln
85 90 95

Asp Gly Ile Leu Lys Tyr Met Leu Lys Met Met Glu Val Cys Lys Ala
100 105 110

Gln Gly Phe Val Tyr Gly Ile Ile Pro Glu Asn Gly Lys Pro Val Thr
115 120 125

Gly Ala Ser Asp Asn Leu Arg Glu Trp Trp Lys Asp Lys Val Arg Phe
130 135 140

Asp Arg Asn Gly Pro Ala Ala Ile Thr Lys Tyr Gln Ala Glu Asn Asn
145 150 155 160

Ile Pro Gly Ile His Glu Gly Asn Asn Pro Ile Gly Pro Thr Pro His
165 170 175

Thr Leu Gln Glu Leu Gln Asp Thr Thr Leu Gly Ser Leu Leu Ser Ala
180 185 190

Leu Met Gln His Cys Asp Pro Pro Gln Arg Arg Phe Pro Leu Glu Lys
195 200 205

Gly Val Pro Pro Pro Trp Trp Pro Asn Gly Lys Glu Asp Trp Trp Pro
210 215 220

Gln Leu Gly Leu Pro Lys Asp Gln Gly Pro Ala Pro Tyr Lys Lys Pro
225 230 235 240

His Asp Leu Lys Lys Ala Trp Lys Val Gly Val Leu Thr Ala Val Ile
245 250 255

Lys His Met Phe Pro Asp Ile Ala Lys Ile Arg Lys Leu Val Arg Gln
260 265 270

Ser Lys Cys Leu Gln Asp Lys Met Thr Ala Lys Glu Ser Ala Thr Trp
275 280 285

Leu Ala Ile Ile Asn Gln Glu Glu Ser Leu Ala Arg Glu Leu Tyr Pro
290 295 300

Glu Ser Cys Pro Pro Leu Ser Leu Ser Gly Gly Ser Cys Ser Leu Leu
305 310 315 320

Met Asn Asp Cys Ser Gln Tyr Asp Val Glu Gly Phe Glu Lys Glu Ser
325 330 335

His Tyr Glu Val Glu Glu Leu Lys Pro Glu Lys Val Met Asn Ser Ser
340 345 350

Asn Phe Gly Met Val Ala Lys Met His Asp Phe Pro Val Lys Glu Glu
355 360 365

Val Pro Ala Gly Asn Ser Glu Phe Met Arg Lys Arg Lys Pro Asn Arg
370 375 380

Asp Leu Asn Thr Ile Met Asp Arg Thr Val Phe Thr Cys Glu Asn Leu
385 390 395 400

Gly Cys Ala His Ser Glu Ile Ser Arg Gly Phe Leu Asp Arg Asn Ser
405 410 415

Arg Asp Asn His Gln Leu Ala Cys Pro His Arg Asp Ser Arg Leu Pro
420 425 430

Tyr Gly Ala Ala Pro Ser Arg Phe His Val Asn Glu Val Lys Pro Val
435 440 445

Val Gly Phe Pro Gln Pro Arg Pro Val Asn Ser Val Ala Gln Pro Ile
450 455 460

Asp Leu Thr Gly Ile Val Pro Glu Asp Gly Gln Lys Met Ile Ser Glu
465 470 475 480

Leu Met Ser Met Tyr Asp Arg Asn Val Gln Ser Asn Gln Thr Ser Met
485 490 495

Val Met Glu Asn Gln Ser Val Ser Leu Leu Gln Pro Thr Val His Asn
500 505 510

His Gln Glu His Leu Gln Phe Pro Gly Asn Met Val Glu Gly Ser Phe
515 520 525

Phe Glu Asp Leu Asn Ile Pro Asn Arg Ala Asn Asn Asn Asn Ser Ser
530 535 540

Asn Asn Gln Thr Phe Phe Gln Gly Asn Asn Asn Asn Asn Asn Val Phe
545 550 555 560

Lys Phe Asp Thr Ala Asp His Asn Asn Phe Glu Ala Ala His Asn Asn
565 570 575

Asn Asn Asn Ser Ser Gly Asn Arg Phe Gln Leu Val Phe Asp Ser Thr
580 585 590

Pro Phe Asp Met Ala Ser Phe Asp Tyr Arg Asp Asp Met Ser Met Pro
595 600 605

Gly Val Val Gly Thr Met Asp Gly Met Gln Gln Lys Gln Gln Asp Val
610 615 620

Ser Ile Trp Phe
625

<210> 53 

<211> 1089 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1)..(1089) 

<223> G198 

<400> 53 

atg gca agg tca cct tgt tgc gag aag aac gga ctc aag aaa ggg cca 48
Met Ala Arg Ser Pro Cys Cys Glu Lys Asn Gly Leu Lys Lys Gly Pro
1 5 10 15

tgg aca tct gaa gaa gac cag aag ctt gtt gac tat atc cag aaa cat 96
Trp Thr Ser Glu Glu Asp Gln Lys Leu Val Asp Tyr Ile Gln Lys His
20 25 30

ggt tat ggt aac tgg aga acc ctc ccc aaa aat gcc ggt acg tgt ttg 144
Gly Tyr Gly Asn Trp Arg Thr Leu Pro Lys Asn Ala Gly Thr Cys Leu
35 40 45

caa aga tgt ggc aaa agt tgt agg tta agg tgg act aat tat ctc cga 192
Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg
50 55 60

cca gat ata aaa cga gga aga ttc tct ttt gag gaa gaa gaa gcc att 240
Pro Asp Ile Lys Arg Gly Arg Phe Ser Phe Glu Glu Glu Glu Ala Ile
65 70 75 80

att cag ctt cat agc ttc tta gga aac aag tgg tct gcg att gcg gcg 288
Ile Gln Leu His Ser Phe Leu Gly Asn Lys Trp Ser Ala Ile Ala Ala
85 90 95

cgt ttg cca gga aga aca gat aat gag atc aag aac ttt tgg aac act 336
Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Phe Trp Asn Thr
100 105 110

cat ata aga aag aag cta ctt aga atg ggg att gat cca gtg act cac 384
His Ile Arg Lys Lys Leu Leu Arg Met Gly Ile Asp Pro Val Thr His
115 120 125

agt cca cga ctc gat ctc ctc gat atc tca tcc atc tta gct tca tct 432
Ser Pro Arg Leu Asp Leu Leu Asp Ile Ser Ser Ile Leu Ala Ser Ser
130 135 140

cta tac aat tca tct tca cat cac atg aac atg tca aga ctc atg atg 480
Leu Tyr Asn Ser Ser Ser His His Met Asn Met Ser Arg Leu Met Met
145 150 155 160

gat act aat cgt cgt cat cag caa caa cat cca ttg gtt aac ccc gag 528
Asp Thr Asn Arg Arg His Gln Gln Gln His Pro Leu Val Asn Pro Glu
165 170 175

ata ctc aag ctt gcg acc tct ata ttc tct caa aac caa aac caa aac 576
Ile Leu Lys Leu Ala Thr Ser Ile Phe Ser Gln Asn Gln Asn Gln Asn
180 185 190

cac aac caa aat caa aac caa aac caa aac ctc gtg gtg gat cat gag 624
His Asn Gln Asn Gln Asn Gln Asn Gln Asn Leu Val Val Asp His Glu
195 200 205



aag caa aca gtt tat cat cat cat gat gtt aac caa acc gga gta aac 672
Lys Gln Thr Val Tyr His His His Asp Val Asn Gln Thr Gly Val Asn
210 215 220

caa tac caa acc gac caa tat ttc gag aac gcg att act caa gaa ctc 720
Gln Tyr Gln Thr Asp Gln Tyr Phe Glu Asn Ala Ile Thr Gln Glu Leu
225 230 235 240

caa tct tcc atg cca cca ttc ccc aat gaa gct cat cag ttt aac gac 768
Gln Ser Ser Met Pro Pro Phe Pro Asn Glu Ala His Gln Phe Asn Asp
245 250 255

atg gat cat cac ttc aat ggt ttt gga gaa caa aat ctt gtt tca act 816
Met Asp His His Phe Asn Gly Phe Gly Glu Gln Asn Leu Val Ser Thr
260 265 270

tct act acg tca gtc caa gat tgc tat aat ccg tca ttc aac gat tat 864
Ser Thr Thr Ser Val Gln Asp Cys Tyr Asn Pro Ser Phe Asn Asp Tyr
275 280 285

tca agt tca aat ttt gtc tta gat cat tct tat tcg gat cag agc ttc 912
Ser Ser Ser Asn Phe Val Leu Asp His Ser Tyr Ser Asp Gln Ser Phe
290 295 300

aac ttc gca aat tcg gtc tta aac acg cca tcc tcg agc ccg agc ccg 960
Asn Phe Ala Asn Ser Val Leu Asn Thr Pro Ser Ser Ser Pro Ser Pro
305 310 315 320

act acg tta aac tcg agt tac atc aat agt agc agt tgc agc act gag 1008
Thr Thr Leu Asn Ser Ser Tyr Ile Asn Ser Ser Ser Cys Ser Thr Glu
325 330 335

gat gaa ata gaa agc tat tgc agt aat ctc atg aag ttt gat att ccc 1056
Asp Glu Ile Glu Ser Tyr Cys Ser Asn Leu Met Lys Phe Asp Ile Pro
340 345 350

gat ttc ttg gac gtt aat ggt ttt att ata taa 1089
Asp Phe Leu Asp Val Asn Gly Phe Ile Ile
355 360




<210> 54 
<211> 362 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 54

Met Ala Arg Ser Pro Cys Cys Glu Lys Asn Gly Leu Lys Lys Gly Pro
1 5 10 15

Trp Thr Ser Glu Glu Asp Gln Lys Leu Val Asp Tyr Ile Gln Lys His
20 25 30

Gly Tyr Gly Asn Trp Arg Thr Leu Pro Lys Asn Ala Gly Thr Cys Leu
35 40 45

Gln Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg
50 55 60

Pro Asp Ile Lys Arg Gly Arg Phe Ser Phe Glu Glu Glu Glu Ala Ile
65 70 75 80

Ile Gln Leu His Ser Phe Leu Gly Asn Lys Trp Ser Ala Ile Ala Ala
85 90 95

Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Phe Trp Asn Thr
100 105 110

His Ile Arg Lys Lys Leu Leu Arg Met Gly Ile Asp Pro Val Thr His
115 120 125

Ser Pro Arg Leu Asp Leu Leu Asp Ile Ser Ser Ile Leu Ala Ser Ser
130 135 140

Leu Tyr Asn Ser Ser Ser His His Met Asn Met Ser Arg Leu Met Met
145 150 155 160



Asp Thr Asn Arg Arg His Gln Gln Gln His Pro Leu Val Asn Pro Glu
165 170 175

Ile Leu Lys Leu Ala Thr Ser Ile Phe Ser Gln Asn Gln Asn Gln Asn
180 185 190

His Asn Gln Asn Gln Asn Gln Asn Gln Asn Leu Val Val Asp His Glu
195 200 205

Lys Gln Thr Val Tyr His His His Asp Val Asn Gln Thr Gly Val Asn
210 215 220

Gln Tyr Gln Thr Asp Gln Tyr Phe Glu Asn Ala Ile Thr Gln Glu Leu
225 230 235 240

Gln Ser Ser Met Pro Pro Phe Pro Asn Glu Ala His Gln Phe Asn Asp
245 250 255

Met Asp His His Phe Asn Gly Phe Gly Glu Gln Asn Leu Val Ser Thr
260 265 270

Ser Thr Thr Ser Val Gln Asp Cys Tyr Asn Pro Ser Phe Asn Asp Tyr
275 280 285

Ser Ser Ser Asn Phe Val Leu Asp His Ser Tyr Ser Asp Gln Ser Phe
290 295 300

Asn Phe Ala Asn Ser Val Leu Asn Thr Pro Ser Ser Ser Pro Ser Pro
305 310 315 320



Thr Thr Leu Asn Ser Ser Tyr Ile Asn Ser Ser Ser Cys Ser Thr Glu
325 330 335

Asp Glu Ile Glu Ser Tyr Cys Ser Asn Leu Met Lys Phe Asp Ile Pro
340 345 350

Asp Phe Leu Asp Val Asn Gly Phe Ile Ile
355 360